Sharing and collaboration are key to what makes using DataRobot across teams easy. We also want to ensure you have the most granular control over how you share and whom you invite to collaborate with you, which is why we have decided to make an important change to our sharing policy within the DataRobot platform.
We’re excited to share that starting Wednesday, August 9th, we’ll begin our journey to a new era in the DataRobot Community! We’ve been hard at work reviewing feedback, talking with users, and discussing amongst our teams to inform an increased commitment to our Community experience.
DataRobot is retiring the App2 SaaS offering that enables anyone to sign up and use the DataRobot platform for free. The DataRobot team is working on a new, more comprehensive experience for customers to explore the capabilities of the DataRobot platform, which will be shared soon. Enterprise buyers can experience a DataRobot Tour and/or reach out for a proof of value conversation with their account team.
As soon as June 30, 2022, DataRobot will disable App2. Once App2 is disabled, you won’t be able to log into an existing account or create a new account.
Download any important assets (datasets, model artifacts, scoring code, etc.) from your App2 account before June 30.
If you are a paying App2 user, your DataRobot success team will reach out to you and work with you to develop a tailored transition plan.
Feel free to reply to this post (click Reply below), or reach out to your DataRobot success team at any time regarding this change.
This webinar explores how automated Machine Learning (AutoML) can combine with Natural Language Processing (NLP) to open up new possibilities for analyzing, categorizing and deriving value from text documents—no complex skills or theoretical knowledge required.
Accelerate the delivery of AI to production through a hosted implementation of the DataRobot AI Cloud platform
Dedicated Managed AI Cloud is a full instance of the latest DataRobot platform release, hosted by DataRobot. By eliminating implementation time and resources, organizations can more quickly apply machine learning, decision intelligence, and MLOps capabilities.
Register and attend the next "Ask the Expert" session: Improving Time Series Models.
In this session, Travis and Calvin will provide a deep dive into how DataRobot can supercharge your forecasting needs. They will show how to achieve better model outcomes--and your business objectives--in the following topics:
How to use clustering/segmenting series for better model performance
How to leverage hierarchical modeling to extract more signal from your data
Strategies to explore feature derivation windows and backtesting length
This session is relevant for any user who has a basic familiarity with data science and the DataRobot platform and is interested in improving their time series models! It does not require advanced knowledge of the platform or of time series modeling.
We're announcing a new user series called Community Live Forums. This is your chance to get time with a DataRobot Data Scientist. There will be two types of live, virtual events, DataRobot Live! and Ask the Expert:
DataRobot Live! -- Are you just getting started with DataRobot and want to learn more? These sessions will be a short, high-level demo with the majority of the session dedicated to answering user questions in a hands-on environment. Learn how DataRobot can automate the entire end-to-end process of preparing, building, and deploying highly accurate models to power your modern AI systems. This is for customers and for users who are currently in a trial.
Ask the Expert -- Are you a regular user and want to ask questions about specific functionalities? These sessions will have topics for a targeted audience, deep diving around your pre-posted questions from the community. To get the most from this session, it is best to have access to the product.
These sessions will be hosted most Tuesdays at 11:00 AM ET. Users can attend as many sessions as they want. You can register here.
Register and attend the next "Ask the Expert" session: Build a Churn Use Case.
In this session, @jake and @Olga Shpyrko will provide a deep dive into how to predict which customers will churn and how to use machine learning to choose the optimal retention strategies. They will show how critical it is to frame the churn problem appropriately, give an example with DataRobot, then discuss how to manage the use case in production and how to measure its success!
This session is relevant for any user who has a basic familiarity with data science and the DataRobot platform and is interested in building a churn model! It does not require advanced knowledge of the platform or of churn modeling.
Thanks to everyone who joined us for our first DataRobot Live! We reviewed the model capabilities of our platform as well as a demo on building production ready models. A transcript of the Q&A is outlined below.
Q1: You mentioned Pathfinder [has] regular use cases, where can I find this?
A1: Pathfinder can be found at pathfinder.datarobot.com. This includes many use case ideas from common industries, some of which are fleshed out to include business considerations and implementations, and even notebook solutions!
Q2: I am new to DataRobot and I am working with a Time Series dataset that only uses weekdays, is there a way to include weekends as well?
A2: DataRobot has a Time Series DataPrep tool that allows you to aggregate to the weekly level easily if you think weekly predictions are granular enough. If you want to stick to daily, you can include those weekends with zero value outcomes, and DataRobot will automatically generate indicators for each day of the week. It will quickly learn that there are zero-value outcomes on weekends.
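The "include those weekends with zero value outcomes" option from A2 can be prepared before upload. The following is only a minimal sketch using the Python standard library, not the DataPrep tool itself; the function name, the sample values, and the (date, value) row layout are assumptions made for illustration.

```python
from datetime import date, timedelta


def fill_missing_days(rows, fill_value=0.0):
    """Return one (date, value) row per calendar day, filling gaps with fill_value.

    rows: iterable of (datetime.date, float) pairs; dates need not be sorted.
    """
    observed = dict(rows)
    if not observed:
        return []
    day, last = min(observed), max(observed)
    filled = []
    while day <= last:
        # Days absent from the input (here: weekends) get the fill value.
        filled.append((day, observed.get(day, fill_value)))
        day += timedelta(days=1)
    return filled


# Hypothetical weekday-only data: Friday 2022-07-08 and Monday 2022-07-11.
weekday_rows = [(date(2022, 7, 8), 120.0), (date(2022, 7, 11), 95.0)]
daily_rows = fill_missing_days(weekday_rows)
```

Note that this fills every missing day, so a genuinely missing weekday would also become a zero; whether that is acceptable depends on your data.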
Q3: Is there a common way to analyze just weekdays? I know that there are ways to build SQL code to use the Time Series stuff with gaps.
A3: Following up on a similar question here, you can access time series data prep to further clean your time series data (or use the recommendation provided above). Learn more here.
Q4: I have seen many different examples/ iterations of Time Series. My current example is to predict One Day Ahead. Any examples that can be shared on just that alone?
A4: As shared live, when you build time series models with DataRobot's AutoTS, you can choose the "forecast window". This could be just the next day, or the next week, or between 4-7 days away from the prediction time. There is a lot of flexibility here.
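To make the "forecast window" idea in A4 concrete, here is a small sketch of which dates a window covers relative to the prediction time. It only illustrates the concept with plain date arithmetic; the function and parameter names are hypothetical and are not DataRobot settings.

```python
from datetime import date, timedelta


def forecast_dates(forecast_point, window_start, window_end):
    """List the dates covered by a window of [window_start, window_end] days ahead."""
    if window_start > window_end:
        raise ValueError("window_start must not exceed window_end")
    return [
        forecast_point + timedelta(days=offset)
        for offset in range(window_start, window_end + 1)
    ]


# One day ahead: a window that starts and ends 1 day after the forecast point.
next_day = forecast_dates(date(2022, 7, 11), 1, 1)

# Between 4 and 7 days away from the prediction time.
later_days = forecast_dates(date(2022, 7, 11), 4, 7)
```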
Q5: What are the steps to collect the information based on the past - I think I saw a video of including future dates, but input them as blank, and the program automatically estimates the missing dates, one day ahead.
A5: Time Series (and feature discovery) do automatic feature engineering in which they derive rolling metrics over the past day, week, month, or any custom time frame. To make future predictions, you do need to provide a dataset with blank outcomes in future dates. If you have multiple series, then you need blanks on future dates for each series.
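As a rough illustration of the "blank outcomes in future dates" requirement in A5, the sketch below appends empty-target rows for each series. The column names ("series_id", "date", "target") and the sample data are assumptions; adapt them to your own dataset.

```python
from datetime import date, timedelta


def add_future_rows(rows, horizon_days):
    """Append rows with a blank (None) target for each series' next horizon_days days.

    rows: list of dicts with hypothetical keys "series_id", "date", "target".
    """
    # Find the most recent observed date for every series.
    last_seen = {}
    for row in rows:
        sid = row["series_id"]
        if sid not in last_seen or row["date"] > last_seen[sid]:
            last_seen[sid] = row["date"]
    # One blank row per series per future day.
    future = [
        {"series_id": sid, "date": last + timedelta(days=step), "target": None}
        for sid, last in sorted(last_seen.items())
        for step in range(1, horizon_days + 1)
    ]
    return rows + future


history = [
    {"series_id": "store_a", "date": date(2022, 7, 11), "target": 95.0},
    {"series_id": "store_b", "date": date(2022, 7, 11), "target": 40.0},
]
scoring_rows = add_future_rows(history, horizon_days=1)
```

When the rows are written to CSV, the None targets would become empty cells, which is the "blank" the answer refers to.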
Q6: Does the model retrain itself automatically when it is fed with new data entries?
A6: It won't retrain itself by default, but you can set up automatic retraining jobs in MLOps. You can retrain based on triggers like a decline in model performance, or you can retrain on a schedule. Learn more here.
Q7: What dataset size can be used with DataRobot? (million data points? and max number of features?)
A7: Data size limits are mostly based on file size. Depending on your license and/or install (if on premises), you can model on 5 GB or 10 GB of data. There is also a feature limit of 20,000 features, but the size limit overrides this one. When dealing with big data, we recommend downsampling the data.
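One generic way to downsample rows before upload, as A7 recommends, is uniform random sampling. The sketch below uses reservoir sampling from the Python standard library so the full dataset never has to fit in memory. It is an illustrative technique chosen here, not a description of how DataRobot downsamples internally, and the function name is hypothetical.

```python
import random


def reservoir_sample(rows, k, seed=0):
    """Return up to k rows chosen uniformly at random from an iterable of unknown length."""
    rng = random.Random(seed)  # fixed seed so the sample is reproducible
    sample = []
    for index, row in enumerate(rows):
        if index < k:
            # Fill the reservoir with the first k rows.
            sample.append(row)
        else:
            # Replace an existing entry with probability k / (index + 1).
            slot = rng.randint(0, index)
            if slot < k:
                sample[slot] = row
    return sample


# Stand-in for a large dataset: 100,000 row identifiers reduced to 1,000.
subset = reservoir_sample(range(100_000), k=1000)
```

For classification problems with rare outcomes, plain uniform sampling may discard too many minority-class rows, so a stratified approach may be preferable.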
If you have any feedback on answers or other questions, please feel free to comment below. You can join us for an upcoming session by registering here.
We're announcing a new user series called Community Live Forums. This is your chance to get time with a DataRobot Data Scientist. There will be two types of live, virtual events:
DataRobot Live! -- These sessions will be a short demo with the majority of the session dedicated to answering user questions in a hands-on environment.
Ask the Expert -- These sessions will have targeted topics for a specific audience.
These sessions will be hosted most Tuesdays at 11:00 AM ET. Users can attend as many sessions as they want. Our first DataRobot Live! will take place on July 12, 2022 at 11:00AM ET. You can register here.
We will be posting reminders about upcoming forums as well as the upcoming schedule. In the meantime, if you have ideas for session topics, we're all ears! Please comment below.
DataRobot is deprecating the use of Python 2 from the platform codebase; this will require nearly all current users to migrate their active projects and model deployments to use the new Python 3 runtime in the platform. Your action is required to migrate projects as well as deployments to ensure business continuity, before these are disabled per the schedule in Managed Cloud or the on-premise version in use.
How do I know if this impacts me or my organization?
This migration affects all users of DataRobot, due to the underlying changes in the Python version used in the platform. To ensure you can oversee and control the changes to your own projects, models and deployments, DataRobot does not automatically execute the migration steps; as a result, user intervention is required.
The following set of users should take action to plan and execute the migration steps:
Managed AI Cloud: All users and organizations using one of our cloud (SaaS) instances and having projects created before March 7, 2022 that leverage Python 2 runtime (new projects started using Python 3 starting on March 7). The Manage Projects page will identify which projects will be deprecated. To ensure your deprecated projects are not disabled in July, follow the preset schedule of migration phases.
AI Cloud Platform Trial: Users who are doing short-term evaluations of the platform, and expect to be using the deprecated projects or deployments associated with these projects after July 25, 2022. The migration steps are to future-proof your work. Otherwise, if you do not intend to continue using these projects and model deployments long-term (after July 25), then you can ignore the migration steps.
On-Premise: All organizations using DataRobot software deployed in their privately-managed cloud or data center, except where a completely new installation (i.e., not an upgrade from a previous DataRobot version) was done on Release 7.1 or higher. An exception is Hadoop-based installations, where the Python migration is necessary regardless of the DataRobot version in use. For such eligible versions and environments, upgrading to DataRobot Release 8.0 is mandatory to execute the migration steps, prior to upgrading to any future Release 9.0 or higher. The timeline for migration is dependent on the current versions in use and upgrade plans for Release 8.0.
Python client: The DataRobot Python API client continues to support Python 2 and at this time we are not deprecating its use, so API client code will not need to be migrated yet. We will provide additional guidance in coming months as we look to deprecate the py2 client code. The project and deployment migration is relevant to all the users mentioned above, irrespective of whether they are using the Python client or not.
How can I get support or have my questions answered?
Existing customers can contact the DataRobot Support team or their representative for further assistance after reviewing the migration guide. DataRobot Community support is also available to all DataRobot users, including Trial and Enterprise. If you have any questions about the migration, ask them here. We hope you’ll encourage further discussions and planning in your organization to execute next steps for migration very soon.
Maintenance for the DataPrep service will be performed on Friday, April 1st, from 4:30AM UTC to 8:00AM UTC. This maintenance will affect both Managed AI Cloud (US and EU) and AI Platform Trial environments.
During this time all users for those environments will experience a brief outage of the DataPrep service.
DataRobot AI Cloud has expanded capabilities designed for all users that enable AI-driven decisions across all lines of business—within a single platform. In the AI Cloud 8.0 Release, we have focused on the following key themes:
Predictive AI Apps
In DataRobot AI Cloud 8.0, we’ve extended our market-leading Time Series capabilities to our AI App Builder. With Automated Time Series, you can create robust, AI-driven forecasts using advanced algorithms, automation, and time-aware guardrails. Then, you can immediately deliver them to your front lines with flexible deployment options that natively embed AI-driven forecasts anywhere in your ecosystem. And now, with our No-Code AI App Builder, you can select any of your deployed Time Series models and build a fully customized AI Application with absolutely no coding required. Within the app, you can compare forecasts with actual values for new data, provide insights on prediction explanations over time, and dig deeply into the reasons driving each forecast. And, thanks to the intuitive interface, quickly and effortlessly share insights drawn with the key decision-makers who need them most.
Optimize for Peak Results
Delivering peak results for AI, while continuously optimizing every model, allows businesses to adapt to changing market dynamics. With DataRobot 8.0, we’re extending our powerful Continuous AI and MLOps capabilities for all environments, even those deployed on premises. DataRobot 8.0 continuously monitors all models in production, keeping every model running at peak performance while adapting for data drift and shifting ethical standards. With the world changing due to challenging triggers (from COVID to changing economic climates, and more), supporting good governance policies is more important than ever. DataRobot’s Continuous AI capabilities automatically recommend new models to more effectively reach business goals, taking into account changing market conditions and new modeling techniques. Continuous AI automatically adapts and recommends the best solution for your business now, protecting your performance in the future.
With the massive growth in data, every business is struggling with a broad range of data widely distributed across traditional enterprise systems, on-premises environments, Data Clouds, and more. But getting the most accurate picture of your data requires the ability to access the widest range of data, bringing together multiple sources to build a more comprehensive, higher quality, higher fidelity model. By delivering best-in-class connectivity, DataRobot AI Cloud 8.0 is giving every business the ability to work with more types of models, while accelerating the time to value and removing barriers through a complete set of pre-built integrations along with write-back capabilities to the most popular cloud data stores, including Snowflake.
Active Directory Connect with Azure Synapse—To ensure that you have a seamless end-to-end experience, we’ve invested time and resources to build best-in-class connectivity. In the AI Cloud 8.0 Release, we’ve added Active Directory Connect with Azure Synapse. This connector allows you to connect to Azure Synapse Analytics for Library imports and exports. For export, the connector uploads data into Azure’s Data Lake service and then exposes the data as a table in the SQL Data Warehouse.
Scoring Code for Snowflake—Available now, DataRobot Scoring Code supports execution directly inside of Snowflake using Snowflake’s new Java UDF functionality. This capability removes the need to extract and load data from Snowflake, resulting in a much faster route to scoring large datasets. You can perform predictions on DataRobot models anywhere you want, by exporting a model to a Java file.
These are just some of the major highlights of DataRobot AI Cloud 8.0. For a complete list of new and enhanced features, please visit the DataRobot documentation release center. In addition, be sure to read Nenshad Bardoliwalla’s new blog to learn more about this mission critical innovation. Make sure to join the conversation and ask questions in the DataRobot Community by replying to this post. We’ve created a series of demo videos to help you understand the changes and guide you through many of the new and updated features. (Note that some features are public preview—contact your DataRobot representative or administrator for information on enabling them.)
Note: Features described here as part of Release 8.0 (released 3-17-2022) are available for both DataRobot on-premise and Managed AI Cloud users.
How can I help?
If you have questions about the release, please click Reply and send them along so we can fill in any blanks. Looking forward to your feedback!
Maintenance will be performed on Monday, February 21st, 2022, from 5PM UTC to 9PM UTC. This maintenance is limited to our EU Managed AI Cloud. During this time window some DataRobot application services may not be stable and some users may experience a brief service interruption. We expect predictions to operate as normal.
To receive email or SMS updates on all maintenance or incidents, please subscribe to DataRobot Status.
Our team is excited to introduce DataRobot Core, a comprehensive offering that broadens its AI Cloud platform for code-first data science experts.
DataRobot Core brings together a complete portfolio of purpose-built capabilities that give data scientists ultimate flexibility in how they deliver AI to the business, enabling faster experimentation and rapid time to value while making teams more efficient and effective at driving clear business impact from AI:
Platform: Unified environment with a first-class, embedded, multilanguage notebook experience; Composable ML to seamlessly pivot between code-first and automated model generation; code-centric pipelines on top of Apache Spark; and an open API to enable programmatic access to the full AI Cloud platform, built for the modern enterprise with the reliability, governance, and compliance support to address the needs of every industry.
Resources: Extensive portfolio of accelerators, third-party integrations, and libraries to expedite AI delivery and drive efficiency, along with evolving education resources to advance skills and enable data scientists to stay at the cutting edge.
Community: Shared knowledge and access to the unique expertise of the DataRobot team, industry experts, and thousands of community members from DataRobot customers and partners, representing some of the largest and most successful AI implementations in the world.
For Data Scientists, By Data Scientists
Designed for how Data Scientists work and built to empower them to best deliver impact for the business. Combined with essential tools and solutions, best practices, and continuous education to support you in this rapidly evolving AI landscape.
Flexibility and Control
Build with your preferred tools and languages. Access powerful code-first notebooks with Python, R, Scala, Spark, and SQL support. Innovate and experiment fast with the DataRobot AI Cloud platform while focusing on delivering unique insights from any data.
Efficiency and Focus
Enable data scientists to focus on key strategic initiatives and eliminate the distractions and time commitments of low-level details to drive insights that create business impact. Focus on data science, not deploying and managing infrastructure. Deliver AI projects with increased velocity by infusing automation into key tasks.
Data science is a team sport, and collaboration is at the core of any great team. DataRobot Core allows you to quickly code your notebooks, collaborate, and share projects among your data science team, and deliver insights to your business stakeholders in a simple way.
Bring new levels of efficiency and power to data scientists, all on the fully unified DataRobot AI Cloud platform. Quickly share and hand off projects between teams and easily access capabilities for centralized management, monitoring, compliance, and continuous optimization.
Product Capabilities For Advanced Data Scientists
We embedded a comprehensive list of product capabilities at every stage of the AI lifecycle to provide you with a seamless experience in one single platform.
Data Prep: Wide range of data engineering best practices, automated with custom-coded pipelines
Access a vast library of tools, accelerators, and third-party integrations to accelerate your delivery of AI to production. Expert-level curriculum from DataRobot University delivers continuous education and skills development to keep you at the forefront of your industry.
I’m reaching out with an update on DataRobot’s response to the Log4j vulnerability.
On December 10, 2021, DataRobot became aware of a vulnerability in the widely used logging library Log4j (CVE-2021-44228) for Java-based applications, which is impacting enterprise applications and cloud services around the world. Since then, a second advisory (CVE-2021-45046) has been issued and the situation continues to evolve.
In response to the initial and subsequent vulnerabilities, DataRobot immediately assembled a cross-functional team to assess the scope of the vulnerabilities and begin implementing steps for remediation.
Security is a foundational element of an Enterprise AI Platform. The new 7.3 release has shipped with a remediation as will all future releases. Please review the following for more detailed guidance:
If you are using any of the above features, the Log4j vulnerability may continue to exist in any previously generated artifact. As general guidance, please follow the Apache Security Advisory for Log4j for mitigation. As the situation evolves, new updates with new mitigations will be posted by Apache at this link. If you need further details, please review the appendix for specific mitigation steps on DataRobot artifacts to help address these risks.
The DataRobot DataPrep CDH 6 connector is being patched as a priority. Customers using this feature should do so only in secured environments until a patch is applied.
Other DataRobot products, including Zepl and Algorithmia, are not affected.
We would also urge you to make a plan to upgrade to DataRobot 7.3 in your current environment as soon as possible. We are happy to work with you on this upgrade and to enable your users on all of the latest capabilities that your upgrade would give them access to.
Please do not hesitate to reach out to your account team or email firstname.lastname@example.org if we can assist you in any way. As always, thank you for including DataRobot as a cornerstone in your AI transformation. We will provide updates on Log4j on DataRobot Community if we have new information relevant to you. For now, we wish you the happiest of holiday seasons.
Chief Product Officer
Appendix: DataRobot Customer-Managed Release
This vulnerability is dependent on which features are enabled and how they are being utilized.
1. DataRobot Scoring Code (formerly known as "CodeGen")
DataRobot provides a capability to export ‘code’ and executable Jar files for the purpose of running predictions on other platforms.
Scoring Code Jars generated from trained models could be vulnerable. If your runtime environment is not already secured, please follow the current guidance provided in the following Apache Security Advisory.
2. MLOps Monitoring Agent
DataRobot provides a capability to monitor and manage ML models running outside of DataRobot’s platform via the MLOps Monitoring Agent.
If you are running the MLOps Monitoring Agent, and your runtime environment is not already secured, please follow the guidance from Apache Security Advisory.
3. Portable Prediction Server (PPS) with MLOps Monitoring enabled
DataRobot provides a capability to export and execute ML models in an external Docker container outside of DataRobot’s platform and monitor the execution via DataRobot’s MLOps Monitoring Agent (see 2, above). Only the Java MLOps Monitoring Agent contains the vulnerable library.
If your runtime environment is not already secured, please follow the current guidance provided in the following Apache Security Advisory.
4. JDBC Driver Support
DataRobot allows customers to connect to external JDBC data sources. We recommend upgrading any JDBC driver to a release which meets the requirements of the Apache Security Advisory.
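For previously generated artifacts such as the Scoring Code Jars in item 1, a quick first check is whether a jar still bundles Log4j's JndiLookup class, the class that Apache's published mitigation for CVE-2021-44228 removes from the classpath. The sketch below is an assumption-laden helper, not a DataRobot tool: the function name is hypothetical, and it inspects only top-level entries, so it will miss classes inside nested jars or relocated (shaded) packages. A negative result is therefore not proof that an artifact is safe.

```python
import zipfile

# Path of the class removed by Apache's mitigation for CVE-2021-44228.
JNDI_LOOKUP = "org/apache/logging/log4j/core/lookup/JndiLookup.class"


def jar_contains_jndi_lookup(jar_path):
    """Return True if the archive has a top-level entry for the JndiLookup class."""
    with zipfile.ZipFile(jar_path) as jar:
        return JNDI_LOOKUP in jar.namelist()
```

A typical use would be to call jar_contains_jndi_lookup on each exported jar and then follow the Apache Security Advisory for any file that returns True.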