
MSC - RAMTRON NEWSLETTER

Ramtron

Squeezing more from an 8051 architecture*

A performance comparison between a standard 8051 MCU and Ramtron's VersaMix VMX51C1020 and Versa VRS51L2xxx

For many designers, the ability to squeeze greater performance and flexibility out of their existing 8051-based systems simply by upgrading the microcontroller and making minor design changes, rather than investing heavily in new architectures, code and development environments, might be something of a godsend. This was precisely the development objective behind Ramtron's most recent additions to its Versa family of high performance 8051-based microcontrollers, previewed at last month's Embedded Systems Conference.

Both the VMX51C1020 and VRS51L2xxx microcontrollers are high performance 8051-based devices with a high level of integration.

The VMX51C1020 is a single-chip, mixed-signal microcontroller solution for a diverse range of signal conditioning, data acquisition, processing and control applications in the industrial, medical, consumer, instrumentation and automotive markets. Its broad set of digital and analogue peripherals provides a competitive advantage by minimising board size and assembly costs.


The VRS51L2xxx is the first member of a new family of advanced Versa 8051 devices that bring integration and performance to even higher levels. Based on a state-of-the-art CMOS process, the VRS51L2xxx can operate at 40MHz and achieve up to 40 MIPS of processing power. The device performs up to 12 times faster than standard 8051s, muscling into 16-bit MCU territory. As it is not based on a pipelined architecture, it avoids the latency that pipelined processors typically incur when they jump to different sections of code. In addition, its comprehensive set of highly configurable digital peripherals eases the load on the processor.

Unlike many 8051 devices that require 12, 6 or in some cases 4 oscillator cycles per system clock cycle, the clock system of the VMX51C1020 and the VRS51L2xxx is directly connected to the device's oscillator, ensuring that one oscillator cycle translates to one system clock cycle.
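The effect of the clock divider can be illustrated with a short calculation. The following is only a sketch: the function name `system_cycles_per_second` is hypothetical and does not come from the test programs, and it covers the clock ratio alone, not the number of system cycles each instruction takes.

```c
#include <stdint.h>

/*
 * System clock cycles available per second for a given oscillator
 * frequency and a given number of oscillator cycles per system cycle
 * (12, 6 or 4 on many 8051 devices, 1 on the VMX51C1020 and VRS51L2xxx).
 * Hypothetical helper, for illustration only.
 */
uint32_t system_cycles_per_second(uint32_t osc_hz,
                                  uint32_t osc_cycles_per_system_cycle)
{
    return osc_hz / osc_cycles_per_system_cycle;
}
```

With a 40MHz oscillator, a divider of 12 gives about 3.3 million system cycles per second, while a divider of 1 gives 40 million. This is a ratio of 12 in system clock rate only and says nothing about how many system cycles a given instruction takes on each core.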

Performance comparisons

To demonstrate the performance advantages achieved by the VMX51C1020 and VRS51L2xxx over the standard 8051 architecture, a simple comparison was undertaken. Of course, there are many ways to compare processor performance and every method is likely to provide different results. For the purpose of this article, a 16-tap FIR filter computation loop, including the data shifting operation, was used as the comparison basis. This is a demanding operation that is normally reserved for advanced processors and DSPs.

The VMX51C1020 and VRS51L2xxx microcontrollers include an enhanced hardware arithmetic unit allowing tremendous performance gain for DSP operations, such as dynamic FIR filtering, typically used in applications that require noise reduction and digital filtering.

Three sets of test programs were developed for these performance comparison tests. They were written using the freeware SDCC C compiler, which performs fairly well in terms of output code density.

Because of architecture and SFR register structure differences that exist between the standard 8051, the VMX51C1020 and the VRS51L2xxx, different versions of the test programs were written to accommodate the three devices. Care was taken to keep the same code structure for each device in order to have a valid basis for comparison.

Absolute processing power comparison

The first test program, implementing the 16-tap FIR computation using C instructions only, was chosen to compare the raw processing power of the VMX51C1020 and VRS51L2xxx to that of a standard 8051 MCU. An I/O port was used to monitor the duration of the FIR loop and data shifting computation. The FIR loop calculation was performed on 12-bit data inputs held in an integer data type (16-bit), and the output is based on a long (32-bit) variable. In order to speed up the FIR loop computation, the coefficients were copied from the Flash memory to the internal RAM. This coefficient copy was not included in the timed FIR processing loop, as it was done only once.
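The structure described above can be sketched in portable C as follows. This is not the original test program: the names `fir_load_coeffs`, `fir_step`, `fir_coeff` and `fir_data` are hypothetical, fixed-width `int16_t` and `int32_t` types stand in for the 16-bit int and 32-bit long of SDCC, and the I/O port toggling used for timing is left out because it is device-specific.

```c
#include <stdint.h>

#define FIR_TAPS 16

/* Coefficients held in RAM; filled once, outside the timed loop. */
static int16_t fir_coeff[FIR_TAPS];
/* Delay line of the most recent samples, newest at index 0. */
static int16_t fir_data[FIR_TAPS];

/* One-time copy of a coefficient table (in Flash on a real device) to RAM. */
void fir_load_coeffs(const int16_t *table)
{
    uint8_t i;
    for (i = 0; i < FIR_TAPS; i++) {
        fir_coeff[i] = table[i];
    }
}

/* Shift in one new sample and return the 32-bit sum of products. */
int32_t fir_step(int16_t sample)
{
    int32_t acc = 0;
    uint8_t i;

    /* Data shifting: move every stored sample one slot toward the end. */
    for (i = FIR_TAPS - 1; i > 0; i--) {
        fir_data[i] = fir_data[i - 1];
    }
    fir_data[0] = sample;

    /* FIR loop: 16 multiply-accumulate operations on 16-bit operands. */
    for (i = 0; i < FIR_TAPS; i++) {
        acc += (int32_t)fir_coeff[i] * fir_data[i];
    }
    return acc;
}
```

In a measurement setup like the one described, a port pin would be set just before the call to `fir_step` and cleared just after it, so that the pulse width gives the duration of one shift-and-filter pass.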

ramtron_Fig1

Figure 1 compares the processing power of the VMX51C1020 and VRS51L2xxx with that of a standard 8051 when all the devices operate at their maximum speed. With the standard 8051 as the comparison basis, a factor of 1 has been assigned to it.

In Figure 1, it is important to note that the operating frequency of the VMX51C1020 is set to 14.75MHz, while the operating frequencies of both the Standard 8051 and the VRS51L2xxx have been set to 40MHz. Also note that the maximum operating frequency of many standard 8051 devices is limited to 24MHz or 33MHz, rather than 40MHz, when operating in X1 mode. Furthermore, many 8051 derivatives on the market provide an X2 mode where 6 oscillator cycles are required per system cycle instead of the 12 required in X1 mode. However, for some of these devices, the maximum operating frequency is significantly reduced compared to X1 mode. The Versa MCU processors are Ramtron's drop-in replacements for standard 8051 devices; most of them can operate at up to 40MHz and do not have an X2 mode.

Figure 2 provides a relative processing power comparison for an operating frequency of 14.75MHz, which corresponds to the maximum operating speed of the VMX51C1020 device. Again, with the standard 8051 as the comparison basis, a factor of 1 has been assigned to it.

For a given oscillator frequency, the single cycle operation of the VMX51C1020 and VRS51L2xxx processor cores makes them 7 to 8 times more powerful than a standard 8051.

Impact of the processing power on FIR loop computation frequency

In many applications, the ability to perform digital filtering on acquired data is an advantage. Because the filtering is done in software, the filter characteristics can be adapted to system conditions without any hardware changes. It can also help to simplify the application's PCB and, therefore, lower costs.

ramtron_Fig2
ramtron_Fig3

Figure 3 compares the maximum acquisition frequency that a system based on a standard 8051, a VMX51C1020 or a VRS51L2xxx could sustain while performing a 16-tap FIR filter operation.

As demonstrated, standard 8051 devices have barely enough processing power to perform operations such as FIR filtering. The figures in the histogram do not take the data acquisition process into account, so actual numbers are likely to be lower, especially for the standard 8051 if a serial type A/D converter is used.

The VMX51C1020 integrates a 7-channel on-chip ADC and an acquisition module that takes care of the entire acquisition process. Also, both the VMX51C1020 and the VRS51L2xxx provide an enhanced SPI interface that greatly reduces the load on the processor if an external serial A/D converter is used to perform the data acquisition.

Based on raw processing power alone, both the VMX51C1020 and the VRS51L2xxx can sustain data acquisition and digital filtering in the kilohertz range. For sensor applications, this facilitates oversampling of the data and simplification of the ADC's analogue filter front end.

The Enhanced Hardware Arithmetic Unit

The VMX51C1020 and the VRS51L2xxx devices integrate an Enhanced Hardware Arithmetic Unit which is able to perform 16-bit multiplications and 32-bit additions, and which includes a 32-bit accumulator as well as a 32-bit Barrel Shifter. All of these operate within one system clock cycle. Moreover, the VRS51L2xxx Arithmetic Unit can perform 16-bit divisions in 5 system clock cycles.
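To make these operations concrete, here is a plain C software model of a multiply-accumulate step followed by a barrel shift. It is only a sketch under stated assumptions: it does not use or describe the devices' actual registers, the names `au_model_t`, `au_clear`, `au_mac` and `au_shift_right` are hypothetical, and a signed 32-bit accumulator is assumed.

```c
#include <stdint.h>

/* Software stand-in for the accumulator; not the hardware register map. */
typedef struct {
    int32_t acc; /* assumed: signed 32-bit accumulator */
} au_model_t;

/* Reset the accumulator to zero. */
void au_clear(au_model_t *au)
{
    au->acc = 0;
}

/* 16-bit x 16-bit multiplication followed by a 32-bit addition. */
void au_mac(au_model_t *au, int16_t a, int16_t b)
{
    au->acc += (int32_t)a * (int32_t)b;
}

/*
 * Arithmetic right shift of the accumulator by n bits (0 <= n <= 31),
 * written so that negative values are handled portably in C.
 */
int32_t au_shift_right(const au_model_t *au, uint8_t n)
{
    if (au->acc >= 0) {
        return au->acc >> n;
    }
    return ~((~au->acc) >> n);
}
```

In a FIR filter, `au_mac` would be called once per tap and `au_shift_right` once per output sample to scale the sum. The model shows only the arithmetic; the single-cycle timing is a property of the hardware unit.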

The hardware-based Enhanced Arithmetic Unit integrated into the VMX51C1020 and the VRS51L2xxx provides a tremendous performance gain, making it possible to perform operations that would normally require a DSP.

To demonstrate the benefit of using the Enhanced Arithmetic Unit of the VMX51C1020 and the VRS51L2xxx, the 16-tap FIR filter program was adapted to take advantage of the Arithmetic Unit.

For the VRS51L2xxx, sections of a more optimised version were written in assembler to take full advantage of the Enhanced Arithmetic Unit. The extremely high performance gain provided by the Arithmetic Unit is clearly demonstrated in Figure 4, where the operating frequency of the Standard 8051 and the VRS51L2xxx is 40MHz and the operating frequency of the VMX51C1020 is set to 14.75MHz.

The blue columns show the maximum frequency at which a 16-tap FIR loop could be executed when relying only on the device's processor. The red columns show the performance achieved when using the Arithmetic Unit on the VMX51C1020 and the VRS51L2xxx, but without using in-line assembler instructions in the FIR computation and data shifting loops. Finally, the green column shows the performance achieved on the VRS51L2xxx when in-line assembler instructions are integrated into the FIR computation and data shifting loops. Even better performance could be achieved by coding the data processing function entirely in assembler. The ability to sustain a FIR computation rate of about 34kHz makes it possible to perform audio processing on the VRS51L2xxx.
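As a rough cross-check, the quoted computation rate can be converted into a cycle budget per output sample. The helper below is a hypothetical illustration, not part of the test programs, and it uses only the 40MHz system clock and the approximate 34kHz rate given in the article.

```c
#include <stdint.h>

/*
 * System clock cycles available for each filter output at a given
 * system clock frequency and output rate (integer division, rounded
 * down). Hypothetical helper, for illustration only.
 */
uint32_t cycles_per_output(uint32_t system_hz, uint32_t output_rate_hz)
{
    return system_hz / output_rate_hz;
}
```

With a 40MHz system clock and a rate of about 34kHz, this gives roughly 1176 system cycles per output sample. Spread over 16 taps, that is about 73 cycles per tap, including the data shifting.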

ramtron_Fig4

* By Francois Turgeon, Application Engineer, Ramtron International Corporation

For further information don't hesitate to contact us: Ramtron@msc-ge.com
