Computer-Aided Design (CAD) is the go-to method for designing most of today’s physical products. Engineers use CAD to turn 2D sketches into 3D models that they then test and refine before sending a final version to a production line. However, the software is notoriously difficult to learn, with thousands of commands to choose from. Becoming truly proficient with the software takes an enormous amount of time and practice.
MIT engineers want to ease CAD’s learning curve with an AI model that uses CAD software much as a human would. Given a 2D sketch of an object, the model quickly creates a 3D version by clicking buttons and selecting file options, just as an engineer would use the software.
The MIT team has created a new dataset called VideoCAD, which contains more than 41,000 examples of how 3D models are built in CAD software. By learning from these videos, which illustrate how different shapes and objects are constructed step by step, the new AI system can now operate CAD software much like a human user.
With VideoCAD, the team is building toward an AI-enabled “CAD co-pilot.” They envision that such a tool could not only create 3D versions of a design, but also work with a human user to suggest next steps, or automatically carry out build sequences that would otherwise be tedious and time-consuming to click through manually.
“There’s an opportunity for AI to increase engineers’ productivity as well as make CAD more accessible to more people,” says Ghadi Nehme, a graduate student in MIT’s Department of Mechanical Engineering.
“This is significant because it lowers the barrier to entry for design, helping people without years of CAD training to create 3D models more easily and tap into their creativity,” adds Faez Ahmed, associate professor of mechanical engineering at MIT.
Ahmed and Nehme, along with graduate student Brandon Man and postdoc Ferdous Alam, will present their work at the Conference on Neural Information Processing Systems (NeurIPS) in December.
Click by click
The team’s new work builds on recent developments in AI-driven user interface (UI) agents — tools that are trained to use software programs to carry out tasks, such as automatically gathering information online and organizing it in an Excel spreadsheet. Ahmed’s group wondered whether such UI agents could be designed to use CAD, which encompasses many more features and functions, and involves far more complicated tasks, than the typical UI agent can handle.
In their new work, the team aimed to design an AI-driven UI agent that takes the reins of the CAD program to create a 3D version of a 2D sketch, click by click. To do so, the team first looked to an existing dataset of objects that were designed in CAD by humans. Each object in the dataset includes the sequence of high-level design commands, such as “sketch line,” “circle,” and “extrude,” that were used to build the final object.
However, the team realized that these high-level commands alone were not enough to train an AI agent to actually use CAD software. A real agent must also understand the details behind each action. For instance: Which sketch region should it select? When should it zoom in? And what part of a sketch should it extrude? To bridge this gap, the researchers developed a system to translate high-level commands into user-interface interactions.
“For instance, let’s say we drew a sketch by drawing a line from point 1 to point 2,” Nehme says. “We translated those high-level actions to user-interface actions, meaning we say: go to this pixel location, click, then move to a second pixel location, and click, while having the ‘line’ operation selected.”
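The translation Nehme describes can be sketched in a few lines of code. This is a minimal illustration, not the authors’ implementation: the function name, the event format, and the coordinate-mapping helper are all assumptions made for the example.

```python
# Illustrative sketch of translating one high-level CAD command
# ("line from point 1 to point 2") into low-level UI actions.
# The event tuples and helper names are hypothetical.

def line_to_ui_actions(p1, p2, to_pixels):
    """Convert a high-level 'line' command into a click sequence.

    p1, p2    -- (x, y) sketch coordinates of the line endpoints
    to_pixels -- maps sketch coordinates to on-screen pixel coordinates
    """
    x1, y1 = to_pixels(p1)
    x2, y2 = to_pixels(p2)
    return [
        ("select_tool", "line"),       # keep the 'line' operation selected
        ("move", x1, y1), ("click",),  # click the first endpoint
        ("move", x2, y2), ("click",),  # click the second endpoint
    ]

# Example: a trivial map that scales sketch units to screen pixels
actions = line_to_ui_actions((0, 0), (1, 2), lambda p: (p[0] * 100, p[1] * 100))
```

Repeating this translation for every command in a build sequence yields the full stream of clicks and drags a human would perform.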
In the end, the team generated over 41,000 videos of human-designed CAD objects, each of which is described in real time in terms of the precise clicks, mouse-drags, and other keyboard actions that the human originally carried out. They then fed all this data into a model they developed to learn connections between UI actions and CAD object generation.
Once trained on this dataset, which they dub VideoCAD, the new AI model could take a 2D sketch as input and directly control the CAD software — clicking, dragging, and selecting tools to build the full 3D shape. The objects ranged in complexity from simple brackets to more complex house designs. The team is training the model on more complex shapes, and envisions that both the model and the dataset could one day enable CAD co-pilots for designers in a wide range of fields.
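At a high level, such an agent runs a simple loop: look at the 2D sketch and the current state of the CAD window, predict the next UI action, execute it, and repeat until the model is finished. The sketch below is a hypothetical illustration of that loop, not the authors’ code; the `policy` and `cad_ui` interfaces are assumptions made for the example.

```python
# Illustrative rollout loop for a UI agent operating CAD software.
# All interface names here are hypothetical.

def run_agent(policy, sketch_image, cad_ui, max_steps=500):
    """Roll out a trained policy that emits UI actions one step at a time.

    policy       -- callable mapping (sketch, screenshot) -> next UI action
    sketch_image -- the 2D input drawing
    cad_ui       -- object exposing screenshot() and execute(action)
    """
    for _ in range(max_steps):
        action = policy(sketch_image, cad_ui.screenshot())
        if action == ("done",):   # policy signals the build is complete
            break
        cad_ui.execute(action)    # perform the click, drag, or key press


class FakeUI:
    """Stand-in CAD interface that just logs executed actions."""
    def __init__(self):
        self.log = []
    def screenshot(self):
        return None
    def execute(self, action):
        self.log.append(action)


# A trivial "policy" that clicks once and then stops, to show the loop shape
steps = iter([("click", 10, 20), ("done",)])
ui = FakeUI()
run_agent(lambda sketch, shot: next(steps), sketch_image=None, cad_ui=ui)
```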
“VideoCAD is a valuable first step toward AI assistants that help onboard new users and automate the repetitive modeling work that follows familiar patterns,” says Mehdi Ataei, who was not involved in the study and is a senior research scientist at Autodesk Research, which develops new design software tools. “This is an early foundation, and I would be excited to see successors that span multiple CAD systems, richer operations like assemblies and constraints, and more realistic, messy human workflows.”
