Products
Solutions
Learn
Training

Self-Paced Online Courses

Instructor-Led Training

MathWorks Certification Program

Events

MATLAB and Simulink Events

Event Proceedings

On-Demand Webinars

Learning Resources

Teach with MATLAB

Research with MATLAB

Student Programs

Books

Contact Us

Visit the Help Center to explore product documentation, engage with community forums, check release notes, and more.

MATLAB and Simulink Videos

Learn about products, watch demonstrations, and explore what's new.

Explore videos
Company
Company

About MathWorks

Mission and Values

Social Mission

Decarbonizing MathWorks

Customer Stories

Careers

Careers Overview

Job Search

Teams and Roles

Contact Us

Decarbonizing MathWorks

See how MathWorks is protecting and restoring Earth's resources.

Learn more
Help Center
Get MATLAB MATLAB
Sign In
Get MATLAB MATLAB Contact Us
Search

Description

Reinforcement Learning for Developing Field-Oriented Control

Use reinforcement learning and the DDPG algorithm for field-oriented control of a Permanent Magnet Synchronous Motor. This demonstration replaces two PI controllers with a reinforcement learning agent in the inner loop of the standard field-oriented control architecture and shows how to set up and train an agent using the reinforcement learning workflow.

Published: 23 Apr 2020

Full Transcript

In this video, we show how to use reinforcement learning for field oriented control of a permanent magnet synchronous motor.

 To showcase this, we start with an example that uses the typical field oriented control architecture, where the outer loop controller is responsible for speed control; whereas the inner loop PI controllers are responsible for controlling the d-axis and q-axis currents. 

We then create and validate a reinforcement learning agent that replaces the inner loop controllers of this architecture.

The use of RL agent is especially beneficial when the system is nonlinear, in which case we can train a single RL agent instead of tuning PI controllers at multiple operating conditions.

In this example, we use a linear motor model to showcase the workflow of field oriented control using reinforcement learning, and this workflow remains the same for a complex nonlinear motor as well.

Let’s look at the Simulink model that implements the field oriented control architecture.  

This model contains two control loops: an outer speed loop and an inner current loop.

The outer loop is implemented in the ‘Speed Control’ subsystem, and it contains a PI controller that is responsible for generating reference currents for the inner loop.

The inner loop is implemented in the ‘Current Control’ subsystem and contains two PI controllers to determine reference voltages in the dq frame.

The reference voltage is then used to generate appropriate PWM signals that control the semiconductor switches of the inverter, which then drives the permanent magnet synchronous motor to achieve desired torque and flux.

Let’s go ahead and run the Simulink model.  

 We can see that the tracking performance of the controllers are good and are able to track the desired speed.

Let’s save this result for later comparison with the reinforcement learning controller.

Now we update the existing model by replacing the two PI controllers in the current loop with a Reinforcement Learning Agent block.  

In this example we use DDPG as the reinforcement learning algorithm, which trains an actor and a critic simultaneously to learn an optimal policy that maximizes long-term reward.

Once the Simulink model is updated with the reinforcement learning block, we then follow the reinforcement learning workflow to setup, train, and simulate the controller.

Reinforcement learning workflow is as follows:

First step is to create an environment. In this example, we already have a Simulink model that contains the permanent magnet synchronous motor modeled using Motor Control Blockset and Simscape Electrical within the ‘Plant and Inverter’ subsystem.

We then use this Simulink model to create a reinforcement learning environment interface with appropriate observations and actions.

Here the observations to the reinforcement learning block are error in the stator currents ‘id error’ and ‘iq error’ and the stator currents ‘id’ and ‘iq’.

Actions are the stator voltages ‘vd’ and ‘vq’.

Next we create the reward signal to let the reinforcement learning agent know how good or bad the actions it selects during training are, based on its interaction with the environment.

Here we shape a reward based on the quadratic reward penalty that penalizes distance from goal and control effort.

Then we move on to creating network architecture.

Here we construct the actor and the critic networks as required by the DDPG algorithm programmatically using MATLAB functions for layers and representations.

The neural networks can also be constructed using the Deep Network Designer app and then imported into MATLAB.

The critic network in this example takes in observations and actions as the input and gives estimated Q values as the output.

The actor network, on the other hand, takes in observations as the input and gives actions as the output.

With actor and critic representations created, we can create a DDPG agent.

The sample time of the DDPG agent is configured depending on the execution requirement of the control loop.

In general, agents with smaller sample time take longer time to train as it involves a greater number of simulation steps each episode.

We are now ready to train the agent.

First, we specify the training options.

Here we specify that we want to run training for at most 2000 episodes and stop training if the average reward exceeds the provided value.

We then use the ‘train’ command to start the training process.

In general, it is best practice to randomize reference signals to the controller during the training process to obtain a more robust policy. This can be done by writing a local reset function for the environment.

During the training process, progress can be monitored in the Episode Manager.

Once the training is complete, we can simulate and verify the control policy from the trained agent.

By simulating the model with the trained agent, we see that the speed tracking performance of field oriented control is good with reinforcement learning agent controlling the stator currents.

On viewing this performance with the previously saved output, we see that performance of field oriented control with reinforcement learning agent is comparable to its PI controller counterpart.  

This concludes the video.

Related Resources

Related Products

Learn More

Reinforcement Learning Agent Deployment: Real-Time Testing on a Speedgoat Machine (4:51)

Beyond PID: Exploring Alternative Control Strategies for Field-Oriented Controllers

Related Information

Get Started with Reinforcement Learning Onramp

Featured Product

Reinforcement Learning Toolbox

Up Next:

Topics include model architecture, algorithm export, scheduling techniques, code profiling, data dictionary, and code verification using processor-in-the-loop (PIL) testing. — AC Motor Control Architecture, Code Generation, and...

Related Videos:

As a part of ensuring power system reliability through accurate system simulation, math models of generating stations are periodically recalibrated through comparison with field test data. This is not a trivial task, and the time and cost associated — Integrating Measured Data with Simulations for Automated...

Virtual engineering technology has undergone rapid progress in recent years and has been widely accepted for commercial product development. Product design and manufacturing organizations are moving from the traditional multiple and serial test cycle — Optimal Neural Network for Automotive Product Development

In this webinar, you will learn how to use simulation to design and implement multivariable controllers for a four-joint robot arm. We use two different techniques that go beyond the traditional tuning of individual PID controllers. The first method — Developing Multivariable Control Systems for Robotics

Simplify your engine control unit by merging multiple devices and using an FPGA. After an introduction to FPGAs and the Xilinx Zynq 7000 platform, Sebastian Straßl and Alexander Ehard, from Starkstrom Augsburg, demonstrate HDL code generation. — Developing a New Control Unit Using an FPGA

View more related videos