I recently completed an AI mentorship program at SharpestMinds, whose central element was to build a project, or even better, a complete product. I chose the latter, and in this article, I write about what I built, how I built it, and what I learned along the way.

Continued from Part 1…

This is the second article of the three-part SorterBot series.

Part 1 — General project description and the Web Application
Part 2 — Controlling the Robotic Arm
Part 3 — Transfer Learning and Cloud Deployment

Source code on GitHub:

Control Panel: Django backend and React frontend, running on EC2
Inference Engine: Object Recognition with PyTorch, running on ECS
Raspberry: Python script to control the Robotic Arm
Installer: AWS CDK, GitHub Actions and a bash script to deploy the solution
LabelTools: Dataset labeling tools with Python and OpenCV

The Robotic Arm

The robot, before assembly (Photo by Author)

The robot arm arrived from AliExpress. It had no specific brand and was advertised as a DIY toy, which made it an affordable option, costing me only $118 (+$40 in tariffs).

Since gripping objects with a robot arm requires a lot of precision, which I could not possibly expect from an arm in this price category, I decided to use a magnet to move the objects to the containers. I ordered one from Grove for $11, specifically designed for Raspberry Pi.

It came with its own control electronics, so a GPIO (General Purpose Input/Output) pin could be used to turn it off and on. For the camera, I purchased a Pi NoIR Camera V2 for $45, which was also pretty easy to set up.

To run my software and control all the above devices, I bought the latest version of the Raspberry Pi, which is the Raspberry Pi 4 Model B with 4 GB of RAM. I ordered it in a package, together with an SD card, housing, heat sinks, and power adapter, for $130. The hardware cost me $344 in total.

Controlling the Arm

The arm arrived in pieces that I had to assemble myself. I thought that would not be a problem, but it was a bit more challenging than I expected. First of all, the instructions were in Chinese, but they included pictures.

I completed the first few steps without any problems, but then the instructions started to show parts that I did not have. At first I thought that my kit was incomplete, but then I noticed that I had similar parts with different sizes, bores in different places, and so on.

I looked around on the website that hosted the manual and found a few other models, some of which had the exact same parts that I had. I tried to figure out whether I had simply received a different model, but unfortunately not: I had a mixture of parts from different arms. From that point on, the instructions were totally useless.

Since it took almost a month for the arm to arrive, ordering a different one was not really an option, so I had to work with what I had. After 8 hours of struggle, drilling bores in different places and bending the metal pieces a bit here and there, I managed to put together a functioning arm. It was not perfect, but it did the job well enough.

This robotic arm has 6 degrees of freedom, meaning there were 6 servos shipped with it. I needed only 4 of them, which was fortunate, because one of the servos broke immediately as I tried to turn it with my hand, and another one was simply not working.

Precisely controlling servos is not a straightforward task. When they receive a signal, they move to the desired position as fast as they can and then hold there until another signal arrives. This behavior is more suitable for applications like RC planes, where servos are used to control the plane’s fins.

In robotics applications, by contrast, this behavior can cause very sudden and shaky movements, although software-based movement smoothing can mitigate it.

The servos are controlled with PWM, which stands for Pulse Width Modulation. A PWM signal repeats with a fixed cycle time (T), which is usually 20 ms for analog servos, corresponding to a frequency of 50 Hz.

This frequency means that the desired servo position is updated 50 times a second (once every 20 ms). To set the shaft angle, an electrical pulse with a width between 0.5 and 2.5 ms has to be applied to the servo in each cycle. The ratio of the pulse width to the cycle time is called the duty cycle.

If a pulse width of 0.5 ms is applied, the servo will move to the most counter-clockwise position, at 1.5 ms it will move to the middle (neutral) position, and at 2.5 ms, it will move to the most clockwise position.

Usually, a servo can move 180 degrees, but this varies by model. The pulse has to be repeated in every cycle to instruct the servo to hold that position. If no pulse is applied, the servo turns off and the shaft can be moved freely.
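The relationship between shaft angle, pulse width, and duty cycle can be sketched in a few lines of Python. This is only an illustration that assumes a perfectly linear servo with 180 degrees of travel; the constant and function names are made up for this example and are not taken from the SorterBot code.

```python
PERIOD_MS = 20.0      # one PWM cycle at 50 Hz
MIN_PULSE_MS = 0.5    # most counter-clockwise position
MAX_PULSE_MS = 2.5    # most clockwise position
RANGE_DEG = 180.0     # assumed travel; real servos vary by model


def angle_to_pulse_ms(angle_deg: float) -> float:
    """Map an angle in [0, RANGE_DEG] linearly onto the pulse width range."""
    if not 0.0 <= angle_deg <= RANGE_DEG:
        raise ValueError(f"angle must be between 0 and {RANGE_DEG} degrees")
    return MIN_PULSE_MS + (MAX_PULSE_MS - MIN_PULSE_MS) * angle_deg / RANGE_DEG


def duty_cycle_percent(pulse_ms: float) -> float:
    """Pulse width expressed as a percentage of the full PWM cycle."""
    return 100.0 * pulse_ms / PERIOD_MS


if __name__ == "__main__":
    for angle in (0, 90, 180):
        pulse = angle_to_pulse_ms(angle)
        print(f"{angle} deg: {pulse:.2f} ms pulse, "
              f"{duty_cycle_percent(pulse):.2f} % duty cycle")
```

Under these assumptions, the neutral position (90 degrees) corresponds to a 1.5 ms pulse, which is 7.5 % of the 20 ms cycle.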

After I connected my servos to the Pi and SSH’d into it, I installed the default library for controlling the GPIO pins: RPi.GPIO. It is very easy to use: I just set up a pin as a PWM output, set a pulse width, and the servo is already moving. However, it moves at full speed immediately and stops very abruptly.

Since a robot arm weighs much more than an RC plane’s fin, I was afraid that these abrupt movements would eventually damage the small gears in the servos, especially the bottom ones that carry the most weight.

Another problem that I came across is that RPi.GPIO provides software-timed PWM, which means that the pulses are generated by the CPU of the Pi. If the Pi is doing some other work at the same time, the pulses might be significantly delayed or even distorted.

This might not be a problem when blinking an LED, but in my case, I needed all the accuracy I could squeeze out of the system. The solution was to go with another library that provides hardware-timed PWM, which is much more accurate (to a few microseconds) and independent of the CPU load. One popular library that provides this feature is pigpio.
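As a rough sketch of what this looks like with the pigpio Python module: `pigpio.pi()` connects to the pigpio daemon and `set_servo_pulsewidth` takes the pulse width in microseconds, where 0 switches the pulses off. The helper names, the clamping limits, and the GPIO number in the usage note are assumptions made for this example, not the code that runs on the arm.

```python
MIN_PULSE_US = 500    # 0.5 ms; pigpio expresses servo pulses in microseconds
MAX_PULSE_US = 2500   # 2.5 ms


def connect():
    """Connect to the local pigpio daemon (pigpiod must already be running)."""
    import pigpio  # imported lazily so the helpers below can be tried without a Pi

    pi = pigpio.pi()
    if not pi.connected:
        raise RuntimeError("could not connect to the pigpio daemon")
    return pi


def set_servo(pi, gpio: int, pulse_us: int) -> int:
    """Clamp the pulse width to the servo range, send it, and return the value sent."""
    clamped = max(MIN_PULSE_US, min(MAX_PULSE_US, int(pulse_us)))
    pi.set_servo_pulsewidth(gpio, clamped)
    return clamped


def release_servo(pi, gpio: int) -> None:
    """A pulse width of 0 stops the pulses, so the shaft can be moved freely."""
    pi.set_servo_pulsewidth(gpio, 0)
```

On the Pi, `pi = connect()` followed by `set_servo(pi, 17, 1500)` would move a servo wired to GPIO 17 (a hypothetical pin choice) to its neutral position.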

Executing Commands

Each command that the arm receives from the inference engine is a pair of absolute polar coordinates. The first coordinate is always the item, the second one is the container. In order to move the item to the container, the arm has to move to the item, turn on the magnet, move above the container, then turn off the magnet.

To achieve this, two tasks had to be solved:

Move a single servo at an appropriate speed and as smoothly as possible.
Coordinate 4 servos to move the arm to the desired position.

Controlling the servo the naive way, by immediately sending the target pulse width, results in an unacceptably fast and sudden movement. The only way to slow this movement down is to generate intermediate points and send them to the servo at small intervals.

Naive control (left), slower, but linear trajectory (center), sine smoothing (right)
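A minimal sketch of the linear variant might look like the following. The `send` callback stands in for whatever actually writes the pulse width to the servo, and the step count and interval are arbitrary illustrative defaults, not tuned values from the project.

```python
import time


def linear_steps(start_us: int, target_us: int, n_steps: int) -> list:
    """Pulse widths evenly spaced from start (exclusive) to target (inclusive)."""
    if n_steps < 1:
        raise ValueError("n_steps must be at least 1")
    return [
        round(start_us + (target_us - start_us) * i / n_steps)
        for i in range(1, n_steps + 1)
    ]


def move_slowly(send, start_us: int, target_us: int,
                n_steps: int = 50, interval_s: float = 0.02) -> None:
    """Send each intermediate pulse width, pausing between them."""
    for pulse in linear_steps(start_us, target_us, n_steps):
        send(pulse)
        time.sleep(interval_s)
```

With these defaults, a move takes about one second (50 steps of 20 ms) regardless of how far the servo has to travel; scaling the number of steps with the distance would keep the speed constant instead.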

To further smooth the movement, instead of a linear profile, sine smoothing can be applied.

Linear (blue) and sine (red) trajectories

To produce the red line from the x values, the following equation can be used:

The equation to generate the sine-smoothed trajectory

This technique slows down the movement just after starting and just before stopping, which helps to achieve a smooth movement.
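The equation itself is not reproduced in this text, so the sketch below uses one common form of sine easing as an assumption: with the linear progress x running from 0 to 1, the smoothed progress is y = (sin((x - 0.5) * pi) + 1) / 2. The function names are illustrative only.

```python
import math


def sine_ease(x: float) -> float:
    """Map linear progress x in [0, 1] to sine-smoothed progress in [0, 1]."""
    return (math.sin((x - 0.5) * math.pi) + 1.0) / 2.0


def sine_steps(start_us: int, target_us: int, n_steps: int) -> list:
    """Intermediate pulse widths that start and end slowly."""
    if n_steps < 1:
        raise ValueError("n_steps must be at least 1")
    return [
        round(start_us + (target_us - start_us) * sine_ease(i / n_steps))
        for i in range(1, n_steps + 1)
    ]
```

Compared with the linear profile, the first and last increments are smaller and the ones in the middle are larger, so the servo accelerates and decelerates gradually while still arriving at the same target.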

To move the arm to a specific location, all the servos have to move in a synchronized way. Generally, it is faster if all of the servos move at the same time (parallel control), but in some cases, for example, immediately after picking up an object, it is better to first move the arm upwards (serial control), otherwise the magnet would bump into nearby objects.

Since moving the arm involves sending commands to a servo at small intervals and sleeping while the servo executes them, controlling multiple servos in parallel requires a multi-threaded approach.
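One way to structure this with Python's standard `threading` module is sketched below. Each servo gets its own thread that walks through a precomputed list of pulse widths; the `send(name, pulse)` callback and the dictionary layout are assumptions made for the example.

```python
import threading
import time


def move_servo(name, steps, send, interval_s=0.02):
    """Send one servo's trajectory step by step, sleeping between steps."""
    for pulse in steps:
        send(name, pulse)
        time.sleep(interval_s)


def move_parallel(trajectories, send, interval_s=0.02):
    """Move all servos at once: one thread per servo, then wait for all of them."""
    threads = [
        threading.Thread(target=move_servo, args=(name, steps, send, interval_s))
        for name, steps in trajectories.items()
    ]
    for thread in threads:
        thread.start()
    for thread in threads:
        thread.join()


def move_serial(trajectories, send, interval_s=0.02):
    """Move the servos one after another, in dictionary order."""
    for name, steps in trajectories.items():
        move_servo(name, steps, send, interval_s)
```

Because the threads spend almost all of their time sleeping, the global interpreter lock is not a bottleneck here. `move_serial` covers the case where one joint, for example the one that lifts the arm, has to finish before the others start.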

The last part of controlling the arm is figuring out which servo angles belong to the coordinates received from the inference engine. The position received is an absolute polar coordinate: the first part is the angle of the arm’s base servo expressed in pulse width (γ), the second part is the distance between the arm’s base axis and the image’s center point expressed in pixels (d).

Since γ is expressed in pulse width, it can be sent directly to the servo. Figuring out which servo angles belong to the required distance d is a harder task. In the optimal case, when the exact dimensions of the robot arm are known and the joints are essentially without backlash, the servo angles could be calculated using basic trigonometry.

In my case, not knowing the dimensions and also lacking proper tools to measure them, I decided to take an experimental approach: I moved the arm manually to 7 positions evenly distributed within the operating range and recorded the servo angles for each position.

Since d depends mostly on the angle of servo1, I plotted the angles of servo2 and servo3 as a function of servo1 position, corresponding to the 7 points recorded.

Plot of pulse width values of servo2 and servo3 as function of servo1, corresponding to the desired positions

As shown in the figure above, a cubic polynomial gives a pretty good fit and yields the equations that can be used in the code to calculate the positions of servo2 and servo3, given servo1.
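To illustrate the fitting step, here is a dependency-free least-squares cubic fit. In practice a library routine such as NumPy's `polyfit` does the same job in one line; the pure-Python version is used only to keep the example self-contained. The calibration points are made-up placeholders generated from an arbitrary cubic, not the 7 positions measured on the arm.

```python
def fit_cubic(xs, ys):
    """Least-squares cubic fit; returns a function that predicts y for a given x."""
    lo, hi = min(xs), max(xs)

    def scale(x):
        # Normalise x to [0, 1] so the powers stay numerically well behaved.
        return (x - lo) / (hi - lo)

    rows = [[scale(x) ** k for k in range(4)] for x in xs]
    # Normal equations (A^T A) c = A^T y, stored as an augmented 4x5 matrix.
    m = [
        [sum(r[i] * r[j] for r in rows) for j in range(4)]
        + [sum(r[i] * y for r, y in zip(rows, ys))]
        for i in range(4)
    ]
    # Gauss-Jordan elimination with partial pivoting.
    for col in range(4):
        pivot = max(range(col, 4), key=lambda r: abs(m[r][col]))
        m[col], m[pivot] = m[pivot], m[col]
        for r in range(4):
            if r != col:
                factor = m[r][col] / m[col][col]
                m[r] = [a - factor * b for a, b in zip(m[r], m[col])]
    coeffs = [m[i][4] / m[i][i] for i in range(4)]

    def predict(x):
        t = scale(x)
        return sum(c * t ** k for k, c in enumerate(coeffs))

    return predict


# Placeholder calibration data (NOT measured values): 7 servo1 pulse widths
# and matching servo2 pulse widths generated from an arbitrary cubic.
servo1_points = [900 + 200 * i for i in range(7)]
servo2_points = [1200 + 300 * t - 150 * t ** 2 + 80 * t ** 3
                 for t in (i / 6 for i in range(7))]
servo2_from_servo1 = fit_cubic(servo1_points, servo2_points)
```

The same function would be called a second time with the servo3 measurements, which gives one fitted curve per dependent servo.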

The angle of servo1 can be calculated from arm-specific constants, namely the distance between the center of the camera’s field of view and the arm’s base axis measured in pixels, and the pulse width values of servo1 corresponding to the minimum and maximum distances in the polar coordinates. These values can be estimated by rough measurements, like using a ruler, then fine-tuned by trial and error.
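A simple way to turn these constants into code is linear interpolation between the two calibrated end points, sketched below. Whether the real relationship is close enough to linear is an assumption here, and all parameter names are hypothetical; the minimum and maximum distances would be derived from the base-to-image-center distance and the image size.

```python
def servo1_pulse(d, d_min, d_max, pulse_at_min, pulse_at_max):
    """Interpolate servo1's pulse width for a distance d, given in pixels.

    d_min and d_max are the smallest and largest reachable distances;
    pulse_at_min and pulse_at_max are the servo1 pulse widths measured there.
    """
    if d_max == d_min:
        raise ValueError("d_min and d_max must differ")
    fraction = (d - d_min) / (d_max - d_min)
    fraction = max(0.0, min(1.0, fraction))  # stay inside the calibrated range
    return round(pulse_at_min + fraction * (pulse_at_max - pulse_at_min))
```

Fine-tuning by trial and error then amounts to nudging the two pulse width constants until the magnet lands on the target at both ends of the range.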

Having completed all the above, the arm is able to move directly to any coordinate. The only adjustment needed is to handle drop-offs above containers, where the arm needs to stop in a higher position. Simply offsetting servo1 with a negative value can achieve that.
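Putting the pieces together, executing one command could be sketched as follows. The `move_to` and `set_magnet` callbacks and the offset value are placeholders for this example; the real script may organize these steps differently.

```python
DROP_OFFSET_US = -150  # hypothetical negative servo1 offset used above containers


def execute_command(item, container, move_to, set_magnet,
                    drop_offset_us=DROP_OFFSET_US):
    """Pick up an item and drop it into a container.

    item and container are (gamma, d) polar coordinates; move_to(position,
    servo1_offset_us) drives the arm and set_magnet(on) switches the magnet.
    """
    move_to(item, 0)                    # lower the magnet onto the item
    set_magnet(True)                    # pick it up
    move_to(container, drop_offset_us)  # stop higher, above the container
    set_magnet(False)                   # release the item
```

Passing the hardware actions in as callbacks keeps the sequence itself easy to test without a robot attached.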

Thank you for reading, and if you have any questions, comments, or suggestions, please let me know!

In the third and last part, I will write about how I used transfer learning, and how I deployed the solution to AWS.