Irvine, California, April 29, 2023: SimInsights is announcing SimGPT, a set of no-code, generative AI-powered features that augment simulations with conversational AI.
Simulation is a decades-old technology whose applications are as diverse as computing itself. They include product design, training, scientific research, and analysis. However, simulations have never before been enriched with conversational interfaces. For training in particular, conversational interaction has significant benefits because it makes it possible to include virtual humans who play various roles. Nonetheless, adding quality conversation to simulations has remained difficult until now. SimGPT is designed to make it easy to add voice- and chat-based conversation to training simulations without any programming or AI skills. Instructional designers and subject matter experts can do it themselves.
SimGPT is the culmination of more than five years of research and development at SimInsights. The R&D began in 2017, when deep learning-based automated speech recognition and text-to-speech services first became available from cloud service providers such as AWS and GCP. Using that technology, our team created conversation-based training simulations and, with funding from the National Science Foundation, conducted research to evaluate their impact on engagement and learning. The research showed that users love conversational interactions in immersive environments.
As we transitioned this capability from research to product, we identified three specific tasks for conversational AI:
- Supporting scripted conversations
- Answering questions
- Responding to commands
1) Scripted conversations
For the first task, the content author creates a dialogue between the learner and the virtual person using a flowchart-like user interface.
Each box is called a node, and each arrow emanating from a box is called an edge. Each node represents a dialogue state. Each edge represents a potential transition from one state to another. Transitions must be triggered, often by user input such as typed text. The text is processed by the AI system, which attempts to map it to one of the edges (arrows) leaving the current node. If the typed text can be mapped with high confidence to one of the edges, the dialogue transitions to the next state along that edge and the simulation proceeds as dictated by the flowchart. Otherwise, the system displays the message “Please try again”. The task of mapping user input to one of the edges is known in the AI community as intent classification. It is a standard task that arises in a huge variety of applications besides soft skills training.
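To make the node-and-edge idea concrete, here is a minimal Python sketch of a dialogue graph with a confidence threshold. It is an illustration only, not SimGPT's implementation: the `Node`, `Edge`, `GRAPH`, and `step` names, the example phrases, the 0.5 threshold, and the word-overlap score (a simple stand-in for a trained intent classifier) are all assumptions made for this example.

```python
import re
from dataclasses import dataclass, field


@dataclass
class Edge:
    target: str          # name of the node this edge leads to
    examples: list[str]  # sample phrasings that should trigger this edge


@dataclass
class Node:
    prompt: str                               # what the virtual person says in this state
    edges: list[Edge] = field(default_factory=list)


def tokens(text: str) -> set[str]:
    """Lowercase word set used by the toy similarity score."""
    return set(re.findall(r"[a-z']+", text.lower()))


def similarity(a: str, b: str) -> float:
    """Jaccard word overlap; a stand-in for a real intent classifier's confidence."""
    ta, tb = tokens(a), tokens(b)
    if not ta or not tb:
        return 0.0
    return len(ta & tb) / len(ta | tb)


def step(graph: dict[str, Node], state: str, user_text: str, threshold: float = 0.5):
    """Return (next_state, message) for one turn of the dialogue."""
    best_score, best_edge = 0.0, None
    for edge in graph[state].edges:
        score = max((similarity(user_text, ex) for ex in edge.examples), default=0.0)
        if score > best_score:
            best_score, best_edge = score, edge
    if best_edge is not None and best_score >= threshold:
        return best_edge.target, graph[best_edge.target].prompt
    # No edge matched with high enough confidence: stay in the same state.
    return state, "Please try again"


# Hypothetical two-branch dialogue used only for illustration.
GRAPH = {
    "greet": Node("Hello, how can I help you today?", [
        Edge("refund", ["I want a refund", "I would like my money back"]),
        Edge("repair", ["my device is broken", "I need a repair"]),
    ]),
    "refund": Node("I can help with a refund. Do you have your receipt?"),
    "repair": Node("Let's get that repaired. What is the model number?"),
}
```

Calling `step(GRAPH, "greet", "I want a refund please")` follows the refund edge, while an unrelated input such as "what is the weather" stays in the `greet` state and returns the retry message.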
2) Question answering
For the second task, answering questions, we set out to build a database of questions that users may ask in training situations. We identified several categories of questions and organized them in multiple ways to guide our R&D. One important dimension that separated questions was the depth of reasoning required, so we categorized questions as low-reasoning or high-reasoning. An example of a low-reasoning question is “Where is the multimeter?”. The system answers this question by highlighting the desired object in the scene to help the user locate it. An example of a high-reasoning question is “How do I measure the current in this wire?”. To answer this question, the system must reason that current measurement requires an instrument and then direct the user to it. Our research showed that low-reasoning questions can be answered with high accuracy using search techniques. However, our attempts to handle high-reasoning questions using classical methods failed to reach a high level of accuracy. Fortunately, GPT-2 and GPT-3 models showed promising results in early trials and we successfully integrated them into HyperSkill [cite]. With GPT-4 and other Large Language Models (LLMs), we find that many high-reasoning questions can be answered with surprisingly high accuracy.
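The low-reasoning case can be pictured as a lookup against the objects in the scene. The Python sketch below is a simplified assumption of how such a search might look, not the HyperSkill implementation: `SCENE_OBJECTS`, the object identifiers, and the `answer_low_reasoning` function are hypothetical. Questions it cannot handle return `None`, which is where a high-reasoning path such as an LLM would take over.

```python
import re

# Hypothetical mapping from object names in the scene to internal identifiers.
SCENE_OBJECTS = {"multimeter": "obj_12", "wire": "obj_07", "battery": "obj_03"}


def answer_low_reasoning(question: str, scene: dict[str, str] = SCENE_OBJECTS):
    """Answer "where is X?" questions by searching the scene for X.

    Returns a highlight action for the matching object, or None when the
    question is not a simple location lookup.
    """
    words = re.findall(r"[a-z]+", question.lower())
    if words[:2] != ["where", "is"]:
        return None  # not a location question; needs deeper reasoning
    for word in words[2:]:
        if word in scene:
            return {"action": "highlight", "object_id": scene[word]}
    return None  # object not present in the scene
```

For “Where is the multimeter?” this returns a highlight action for the multimeter, while “How do I measure the current in this wire?” returns `None` because answering it requires reasoning about which instrument is needed.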
3) User commands
The third and final task, responding to user commands, is a variant of intent classification that also involves slot extraction. For example, the command “delete that chair” can be seen as the delete intent with a slot value of chair. This task can be handled using the same methods as the first task.
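As a rough illustration of intent plus slot extraction, the Python sketch below parses a command into an intent and an object slot using a keyword table. This is an assumption-laden toy, not the method used in SimGPT: the `INTENT_VERBS` table, the `FILLER` word list, and the `parse_command` function are hypothetical, and a production system would rely on a trained model rather than fixed keywords.

```python
import re

# Hypothetical verb-to-intent table; synonyms map to the same intent.
INTENT_VERBS = {"delete": "delete", "remove": "delete", "move": "move", "highlight": "highlight"}

# Words that carry no slot information.
FILLER = {"the", "that", "this", "a", "an"}


def parse_command(text: str):
    """Split a command into an intent and an object slot.

    Returns None when the first word is not a known command verb or when
    no object is named.
    """
    words = re.findall(r"[a-z]+", text.lower())
    if not words or words[0] not in INTENT_VERBS:
        return None
    slot_words = [w for w in words[1:] if w not in FILLER]
    if not slot_words:
        return None
    return {"intent": INTENT_VERBS[words[0]], "object": " ".join(slot_words)}
```

With this sketch, “delete that chair” yields the delete intent with the object slot set to chair, matching the example above.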
Over time, we identified a long list of additional tasks that are highly useful in training scenarios and added them to our development roadmap. Examples include sentiment detection and open-ended chit-chat. LLMs like GPT-4, with their powerful generative AI capabilities, can potentially solve all of these tasks. SimGPT is our name for all of these capabilities in the grounded and embodied training context of HyperSkill. The SimGPT feature will be included only in certain licensing options of HyperSkill because our company incurs a cost every time the LLM is invoked. For more details, please visit our pricing page.
In the coming months, SimGPT will grow more powerful as more and more tasks are supported. We also plan to continue publishing our research in conferences and peer-reviewed journals [International XR conference paper, I/ITSEC 2023 paper]. We especially encourage research groups at universities and other organizations to get in touch and take advantage of HyperSkill's cutting-edge immersive and conversational AI-powered capabilities. The no-code rapid authoring and rich dataset collection capabilities of HyperSkill mean that researchers can save months, and perhaps even years, of effort and publish more often.
Conversational AI is a powerful tool for simulation-based learning and assessment. With SimGPT, we aim to unlock this capability for non-technical users such as instructional designers and subject matter experts.