FPGA CPU News of September 2000


Friday, September 29, 2000
Remembering Pierre Elliott Trudeau, 1919-2000.

I recall a New Year's Eve speech at Nathan Phillips Square on a bitterly cold night, and being unexpectedly and deeply moved by this immensely charismatic man and his vision for Canada.

[updated 00/08/04] Justin Trudeau's eulogy for his father (RealMedia).

Wednesday, September 27, 2000
Here at ESC, one station in the Lineo booth is demonstrating uCLinux running on a LEON SPARC implemented in a Xilinx Virtex-800 in an XESS XSV-800.

Lineo announces Linux on FPGA Cores.

To my knowledge, this marks the first time that any flavor of Linux has run on a monolithic FPGA CPU. A milestone. From Why FPGA CPUs?,

"a number of these designs will be published under GPL or put in the public domain. There will be communities of users of certain free CPU designs, similar to the open software movement. There will be GCC tools chains, lunatic fringe Linux ports, etc."
Congratulations to Jiri Gaisler (ESA) and Jeff Dionne (Lineo) and co.! By the way, sorry about that friendly pejorative "lunatic fringe" -- this work is certainly very relevant to mainstream embedded systems developers.

Monday, September 25, 2000
This week I'll be at Embedded Systems Conference 2000. On Thursday, Sept. 28, 10:30-12:00, I will join Tom Cantrell in presenting class #524, "Roll Your Own RISC". See you there.

This being ESC week, expect lots of FPGA CPU and SoC news.

Xilinx Virtex-II 3.125 Gbps serial links
Xilinx Establishes Leadership Direction for High-Bandwidth I/O Technologies. "On tap are LDT, POS-PHY4, 3.125Gbit, Infiniband, XAUI, RapidIO, and Fibre Channel for FPGAs; Virtex-II architecture to break 10 Gb/sec barrier next year." Plus support for gig and 10-gig ethernet.

'"The integration of very high bandwidth interconnects and the previously announced IBM PowerPC processor core within our Virtex-II architecture will offer customers the most flexible, fastest time-to-market development platform in the industry," said Dennis Segers, senior vice president and general manager of the Xilinx Advanced Products Group.'
Xilinx and Conexant Announce Licensing Agreement of SkyRail 3.125 Gbps Serial Transceiver Technology.
'"In creating our next-generation Virtex-II FPGAs, we felt it was critical to include high-speed serial interconnect technology, as leading system architects are increasingly employing gigabit serial interconnect technologies as the means to connect high-bandwidth elements within their systems. ... " said Erich Goetting, vice president of Product Development for the Xilinx Advanced Products Group.'
Murray Disman, ChipCenter: Xilinx Releases High-Speed I/O Plans.

Altera Excalibur hard processor cores
Altera Showcases Excalibur Embedded Processor Solutions at Embedded Systems Conference.
Altera Unveils Technical Details of Its ARM- and MIPS-Based Excalibur Product Offerings.
Altera Extends Excalibur Design Flow to Include Industry-Standard Processor Cores.

Altera reveals the hard core aspects of their SoPC strategy. According to the press releases, the new XA family is based on the ARM922T core, and provides an MMU, 8 KB I- and D-caches, and the Thumb small footprint instruction set extensions. The XM family is based on the MIPS32 4Kc, with 16 KB I- and D-caches, and a multiply/divide unit. Both families run at 200 MHz. Both share a "stripe" of hard cores of memory interfaces (SDRAM, SRAM, FLASH), embedded memory, and peripherals (UART, counters/timers, interrupt controller, trace/debug, and PLLs). Both use AMBA AHB bus interfaces to soft cores implemented in programmable logic.

Now the on-chip bus landscape comes into focus. Altera with AMBA, Xilinx with CoreConnect (whitepaper). Of course, differing processor architectures + differing on-chip buses = customer lock-in.

Murray Disman, ChipCenter: Altera Unveils Excalibur Details.
Craig Matsumoto, EE Times: Altera, Xilinx hop diverging buses in SoC plans.
Jayant Mathew, EDN: Altera Unveils Excalibur Technical Details.
Richard Goering, EE Times: Verisity, ARM employ Amba bus to tackle SoC verification.

Bryan Hoyer and Martin Won of Altera, in Embedded Systems Development 9/00: Marrying Processors and Programmable Logic. (PDF).

Here are some other interesting FPGA SoC articles by Martin Won:
ChipCenter (11/99): The Future of System Design: Configurable Cores and CPLDs.
ChipCenter (7/00): Programmable Logic and the Challenges of System-on-a-Chip Design.
EDN (8/17/00): New debug strategies combine on-chip and off-chip tools.

Triscend
Tom Williams, Embedded Systems Development: [Triscend A7] SoC Houses ARM Core and PLD Matrix.

Sunday, September 24, 2000
Xilinx Student Edition 2.1i is shipping. ISBN: 0-13-028907-8. FatBrain has it "in stock" as does XESS, whereas Amazon.com says "4-6 weeks".

The significantly cheaper $55 Student Ed. 2.1i appears to target the XCV50 (and therefore the bit-compatible XC2S50, the Spartan-II-50, about $13 q1), making Virtex development much more accessible to budget-conscious designers. And it's perfectly adequate for developing and testing reusable Virtex IP cores, including processors, for subsequent integration into larger Virtex/E/II designs (which require the professional tools).

We're in for lots of fun -- and some astonishing student projects. See Teaching. Hats off to Xilinx and Prentice-Hall. Thanks to products like this, I think we will witness more new processor, signal processing, and SoC designs in the next two years than in the whole history of computing.

Alas, unlike the 1.3 and 1.5 editions, the 2.1i edition does not seem to bundle Vanden Bout's Practical Xilinx Designer Lab Book. That book (together with an XS40 board and XSTOOLS) is a great, Xilinx tools-focused, hands-on introduction to digital design with FPGAs -- which is why XSOC targets that platform. XESS:

"Prentice Hall has lowered the price of the XSE-2.1i package, and we have passed the price reduction on to you. But Prentice Hall no longer includes The Practical Xilinx Designer Lab Book in the package! To correct this shortcoming, XESS Corp. will make a sequence of tutorials and labs available online in November."

Friday, September 22, 2000
We welcome autumn with a wee little bit of site tweaking.

Anthony Cataldo, EE Times: Embedded CPUs break out baseband functions for 3G apps and Intel proposes modular design for XScale systems. Marketing the concept of an additional general purpose "application processor" to run alongside the radio/comms DSP(s) in 3G phones.

"The applications and client should be developed separately from the communications stack. We want to free them so that they can develop at their own pace."

Perhaps the most fascinating spectator sport of the next two years will be watching and handicapping the many, many big companies (software, hardware, telcos, etc.) scheming and jockeying for a position to influence or control the all important emerging 3G wireless phone computing platform.

Richard Goering, EE Times: Xilinx acquires formal verification tools.

"Xilinx plans to use the technology with the 10-million gate Virtex II FPGA architecture. The company will work with EDA vendors to incorporate the technology into high-level FPGA design flows."
I don't understand. I thought the purpose of formal equivalence checking tools is to prove (sans simulation) that your specification model matches your implementation, in flows where the implementation is not mechanically derived from the specification model. But in FPGA designs where the flow is HDL --> (synthesis) --> EDIF --> (PAR) --> config bits, how would equivalence checking help? It seems to me that only in the case where there is a failure in the synthesis or place-and-route tools would it uncover a discrepancy. Can someone explain?

[update 00/09/24] Xilinx press release. Perhaps its external use has more to do with future FPGA/hard core hybrids.
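
For readers who have not met equivalence checking, here is a toy sketch of the idea in Python: prove that a "specification" function and a gate-level "implementation" agree on every input, with no test vectors chosen by hand. The function names and the 2-bit adder are hypothetical examples. Production tools use BDD or SAT techniques rather than brute-force enumeration, and nothing here describes the technology Xilinx acquired.

```python
from itertools import product

def spec_add(a, b):
    """Specification: add two 2-bit values, giving a 3-bit result."""
    return (a + b) & 0b111

def impl_add(a, b):
    """Implementation: the same adder written as explicit gates."""
    a0, a1 = a & 1, (a >> 1) & 1
    b0, b1 = b & 1, (b >> 1) & 1
    s0 = a0 ^ b0
    c0 = a0 & b0
    s1 = a1 ^ b1 ^ c0
    c1 = (a1 & b1) | (c0 & (a1 ^ b1))
    return s0 | (s1 << 1) | (c1 << 2)

def buggy_add(a, b):
    """Like impl_add, but the carry out ignores the carry in (a deliberate bug)."""
    a0, a1 = a & 1, (a >> 1) & 1
    b0, b1 = b & 1, (b >> 1) & 1
    s0 = a0 ^ b0
    c0 = a0 & b0
    s1 = a1 ^ b1 ^ c0
    c1 = a1 & b1
    return s0 | (s1 << 1) | (c1 << 2)

def equivalent(f, g, bits=2):
    """Exhaustively compare f and g over all pairs of `bits`-wide inputs."""
    values = range(1 << bits)
    return all(f(a, b) == g(a, b) for a, b in product(values, repeat=2))

print(equivalent(spec_add, impl_add))
print(equivalent(spec_add, buggy_add))
```

In a push-button HDL-to-bits flow, the "implementation" is produced mechanically from the "specification", so a mismatch like the one in `buggy_add` would point to a tool bug rather than a design bug, which is the crux of the question above.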

Thursday, September 21, 2000
Green Mountain Computing Systems releases the GM HC11, a free HC11 CPU core. This is the third FPGA CPU core announcement this week.
"The GM HC11 CPU Core package includes the synthesizable core, projects, self-checking testbenches and a debugger."

"We have synthesised the CPU core for both Xilinx and Altera FPGAs using FPGA Express from Synopsys. The design used 1076 slices and runs at 31MHz on the Xilinx Virtex 400E part. On the Altera APEX 20K100 part, the design ran at 32MHz and used 2142 slices."

statistics
There must be considerable interest in FPGA CPU and SoC development, for in the past six months fpgacpu.org has received 62,000 page views from 9,000 unique visitors. (Even discounting spider robot visitors and repeat dial-up visitors, that's a lot of interest.) There have been 650 downloads of the latest beta of XSOC, and there are currently 110 subscribers to the fpga-cpu mailing list.

Thank you for visiting. "Let's have fun."

Wednesday, September 20, 2000
Brian Davis announces his YARD-1A (Yet Another RISC Design) FPGA processor. Sixteen 16- or 32-bit registers, 16-bit instructions (two operands), two stage pipeline, single cycle execution, one branch delay slot, hardware return stack. Written in VHDL, it "push-button" synthesizes under Synplify to run at 40 MHz in an XC2S100-5. Tested on an Insight Spartan-II demo board.

Xilinx Virtex Power Estimate Worksheet. Cool... :-)

Speaking of power estimation, yesterday I wrote "total energy per cycle" but Philip Freidin points out that a better figure of merit for energy consumption is "Joules per task" or perhaps Joules per standard benchmark. I agree. This is much better than quoting some apples-to-oranges number like mW/MHz, since one clock cycle on one implementation may accomplish quite a bit more work than one cycle on another implementation.
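
A small Python sketch makes the difference concrete. The two cores and all of their numbers (power, clock rate, cycles per task) are invented for illustration and do not describe any real device.

```python
def mw_per_mhz(power_mw, clock_mhz):
    """The apples-to-oranges figure: power normalized by clock rate."""
    return power_mw / clock_mhz

def joules_per_task(power_mw, clock_mhz, cycles_per_task):
    """Energy to finish one task: power (W) times run time (s)."""
    seconds = cycles_per_task / (clock_mhz * 1e6)
    return (power_mw / 1e3) * seconds

# Hypothetical core A: frugal per cycle, but needs twice the cycles.
a_rate = mw_per_mhz(100.0, 50.0)
a_energy = joules_per_task(100.0, 50.0, 2_000_000)

# Hypothetical core B: hungrier per cycle, but does more work per cycle.
b_rate = mw_per_mhz(150.0, 50.0)
b_energy = joules_per_task(150.0, 50.0, 1_000_000)

print(f"A: {a_rate:.1f} mW/MHz, {a_energy * 1e3:.1f} mJ per task")
print(f"B: {b_rate:.1f} mW/MHz, {b_energy * 1e3:.1f} mJ per task")
```

With these made-up numbers, core B looks worse in mW/MHz yet spends less energy to complete the task, which is exactly why Joules per task is the better figure of merit.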

Tuesday, September 19, 2000
From the fpga-cpu list: on comparing and benchmarking FPGA CPU cores.

A theme of my work: for FPGA CPU cores, simple is beautiful. In my experience,

  • simpler is smaller
    • smaller is cheaper
    • smaller is faster
    • smaller is more power frugal
  • simpler is easier to test

Simpler is smaller: The simpler the processor design, the fewer gates and wires are required to implement it. Programmable interconnect, even local wiring, is usually slower than the logic it connects, and, bit for bit, a 2-input multiplexer is as expensive as a register file or an adder.

An easy way to gauge the complexity of a processor design is to simply count the multiplexers in the datapath.

Another way is to count the number of lines in its HDL model. One can write a perfectly adequate RISC processor (for integer C code) in a couple of hundred lines of Verilog.
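
As a rough sketch of the line-counting metric, the Python below counts the non-blank, non-comment lines in a Verilog source string. The function name `count_verilog_lines` and the sample module are hypothetical, and the comment stripping is deliberately naive (it would be fooled by `//` inside a string literal).

```python
import re

def count_verilog_lines(source: str) -> int:
    """Count lines that still contain code after comments are removed."""
    # Strip /* ... */ block comments first, then // line comments.
    no_block = re.sub(r"/\*.*?\*/", "", source, flags=re.DOTALL)
    no_line = re.sub(r"//[^\n]*", "", no_block)
    return sum(1 for line in no_line.splitlines() if line.strip())

sample = """\
// 2-input mux, one of many in a datapath
module mux2(input a, input b, input sel, output y);
  /* select b when sel is high,
     else a */
  assign y = sel ? b : a;

endmodule
"""

print(count_verilog_lines(sample))  # prints 3
```

To measure a real design, read each file with `open(path).read()` and sum the counts across the model's source files.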

Smaller is cheaper: A three-stage (fetch, decode, execute) pipelined RISC CPU really needs only 300-400 logic cells (existence proof: xr16).
A small core allows the SoC designer to ship an SoC in a smaller, cheaper part. A small core fills less than half of an XCS20XL or an XC2S30, leaving more than half free for the rest of your system design. A small core also opens up the delicious possibility of building a 4- or 8-way multiprocessor SoC in a Spartan-II.
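
The arithmetic behind these claims can be sketched in a few lines of Python. The logic-cell capacities below are approximate figures that should be checked against the Xilinx datasheets; treat them, the 400-cell core size (the upper end of the range above), and the 25% reserve for the rest of the system as assumptions of this sketch.

```python
# Assumed logic-cell capacities; verify against the datasheets before use.
DEVICE_LOGIC_CELLS = {
    "XCS20XL": 950,
    "XC2S30": 972,
    "XC2S100": 2700,
    "XC2S200": 5292,
}

CORE_CELLS = 400  # assumed: upper end of the 300-400 logic cell range

def core_fraction(device, core_cells=CORE_CELLS):
    """Fraction of the device's logic cells that one core occupies."""
    return core_cells / DEVICE_LOGIC_CELLS[device]

def max_cores(device, core_cells=CORE_CELLS, reserved_percent=25):
    """Cores that fit after reserving part of the device for other logic."""
    usable = DEVICE_LOGIC_CELLS[device] * (100 - reserved_percent) // 100
    return usable // core_cells

for device in DEVICE_LOGIC_CELLS:
    print(f"{device}: one core uses {core_fraction(device):.0%}, "
          f"up to {max_cores(device)} cores with 25% held in reserve")
```

Under these assumptions a single core takes a little over 40% of the two small parts, and the larger Spartan-II devices have room for several cores.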
Smaller is faster: In the slow-interconnect FPGA world, the fewer columns of logic that a pipeline clock enable or similar signal must traverse, and the "closer" the I-cache is to the datapath, the shorter the minimum cycle time.
An FPGA-optimized RISC processor core should target a minimum cycle time of approximately the execution stage recurrence cycle time (e.g. operand-register clock-to-out delay plus adder delay plus result mux delay plus forwarding mux delay plus operand-register setup time), which is less than 15 ns in a slow Virtex device.
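
The recurrence can be written out as a simple sum. In the Python sketch below, every delay value is a made-up placeholder chosen only to land under the 15 ns figure; real numbers come from static timing analysis of a placed and routed design.

```python
# Hypothetical delays in nanoseconds, for illustration only.
EXECUTE_STAGE_DELAYS_NS = {
    "operand register clock-to-out": 1.5,
    "adder": 5.0,
    "result mux": 2.5,
    "forwarding mux": 2.5,
    "operand register setup": 1.0,
}

def min_cycle_time_ns(delays):
    """Sum the delays around the execute-stage recurrence loop."""
    return sum(delays.values())

def max_clock_mhz(cycle_ns):
    """Convert a cycle time in ns to a clock frequency in MHz."""
    return 1000.0 / cycle_ns

cycle = min_cycle_time_ns(EXECUTE_STAGE_DELAYS_NS)
print(f"cycle time {cycle} ns, max clock {max_clock_mhz(cycle)} MHz")
```

Any extra logic or routing inserted into this loop, such as a wider forwarding mux, adds directly to the cycle time, which is why the loop is the natural target to optimize.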

A small design is easier to optimize via retiming, floorplanning, and explicit technology mapping of critical paths.

Smaller is more power frugal: The less wire capacitance you charge and discharge each cycle, the less total energy per cycle (proportional to CV^2) you waste.
A smaller design, in a smaller device, will require less power.
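
A minimal Python sketch of the CV^2 relationship follows. The switched capacitance, the activity factor, and the clock rate are hypothetical values, and the 2.5 V supply is assumed as a typical core voltage for this class of device.

```python
def energy_per_cycle_joules(switched_capacitance_f, supply_v, activity=1.0):
    """Dynamic energy per clock cycle, proportional to C * V^2."""
    return activity * switched_capacitance_f * supply_v ** 2

def dynamic_power_watts(switched_capacitance_f, supply_v, clock_hz, activity=1.0):
    """Dynamic power: energy per cycle times cycles per second."""
    return energy_per_cycle_joules(switched_capacitance_f, supply_v, activity) * clock_hz

# Hypothetical designs: the smaller one switches half the wire capacitance.
big = energy_per_cycle_joules(200e-12, 2.5)
small = energy_per_cycle_joules(100e-12, 2.5)

print(f"big:   {big * 1e9:.3f} nJ per cycle")
print(f"small: {small * 1e9:.3f} nJ per cycle")
print(f"big at 50 MHz: {dynamic_power_watts(200e-12, 2.5, 50e6) * 1e3:.1f} mW")
```

Halving the switched capacitance halves the energy per cycle, while lowering the supply voltage pays off quadratically.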
[updated 03/18/01:]
Simpler is easier to test: A simpler design exhibits fewer cases and paths; therefore the test bench required to exercise them will be smaller and simpler, and full test coverage will be easier to achieve.

Monday, September 18, 2000
Cary Snyder, Microprocessor Report: FPGA Processor Cores Get Serious ({09/18/00-01}). A survey article, most detail on Altera Nios.
"One may have to look to find economy FPGA devices in a price range suitable for embedded-processor use. ... Size and performance limitations may, for now, restrict their use to the simpler soft embedded-processor cores like Altera's Nios or to smaller designs like the xr16/32, which targets Xilinx devices (see http://www.fpgacpu.org/xsoc/xr16.html)."
Here (once again) is my secret sauce: in the fast-gate and slow-and-meager-interconnect world of FPGAs simpler is smaller and smaller is often faster. See also Altera, Xilinx Announce.

For you FPGA newbies, Ingo Cyliax of Circuit Cellar and Derivation Systems has written a series of introductory articles over at Circuit Cellar Online: The FPGA Tour The Foundation Environment Design Downloading and Debugging Charming Adders Virtex Proto Board Multiplier Tricks.

Saturday, September 16, 2000
Craig Matsumoto, EE Times: Startup puts a fresh spin on programmable cores. eASIC is building fast dense programmable logic technology for ASICs. The logic cells are programmable via lookup tables and the interconnect is configurable via "jumper" connections in the top metal layer.

Today's Scripting News:

"Companies. Why does the software conversation always revolve around companies? What other artform forces its artists to become CEOs to be taken seriously (and even then not). Stop everything and read this section of a very early DaveNet piece, written long before anyone had heard of open source. This has been bothering me ever since I got started in the software business. A huge disconnect. I don't want to work for a company. I like making software. Now what?"

Why Companies. Classic DaveNet. "It's always been frustrating to me to have my products evaluated based on the size of the company it comes from." On the other hand, you have to marvel at the army Altera marshaled and mobilized for the Nios launch: marketing, PR, development, testing, documentation, legal, business development, FAEs, distributors, reps, seminars, trade show presence, Cygnus support, development board design and fabrication, kit fulfilment, billing, tech support, etc.

Friday, September 15, 2000
Jiri Gaisler's LEON SPARC implementation is now running at 25 MHz in 80% of an XCV300-4.

Rob Finch announces the Sparrow FPGA CPU, with 24-bit instructions and data, and an on-chip MMU and I-cache.

We also have a new 24-bit instruction word 16/32-bit RISC core in the works.

Markus Levy, EDN: Processors drive (or dive) into programmable-logic devices. "Will more and more processor core-based systems turn to PLDs? Quicker time to market and design flexibility, along with better process technology to lower prices and increase density would indicate that they will. Nevertheless, it will be interesting to see whether this trend toward PLD-friendly processors can meet the increasingly difficult demands of embedded systems." Alas, no mention of us.

Craig Matsumoto, EE Times: Actel aims anti-fuse FPGAs at handheld appliances. "Actel ... found its groove in MP3 players and has sold chips into 90 percent of the players on the market ..."

Murray Disman, ChipCenter: Actel Targets e-Appliance Consumer Applications.

Alexander Wolfe, EE Times: Startup rekindles Java core race with Moon IP. "Vulcan ASIC Ltd. ... unveils a Java hardware core called Moon. ... Vulcan built Moon as a member of Altera's consultants alliance program, implementing first silicon on Altera's Apex 20K technology."

working on many fronts
I attended an Altera Nios Hands-On Design Workshop, and I have a report that I'll polish up and post here sooner or later.

The evening following the seminar, I designed a new even simpler 16/32-bit RISC core for students, very similar to the xr16, but with an integrated I-cache memory and no pipelining. Static timing analysis says it should run at about 40 MHz in a slow Virtex. Less than 200 lines of Verilog.

I've been implementing test designs on my XESS XSV-300, including an 1152x864 color frame buffer.

I've been back to work on the xr32 test suite and tools and finished the new "100% synchronous interface" async SRAM memory controller.

I also polished CNets to the point that it could emit EDIF that the Xilinx tools could place and route. Then I went back and redesigned everything for wide buses and registers, so at the moment it's all "taken apart" and non-functional again.

I also learned the Python programming language. It is simple, clean, and you get good results fast. I may rewrite the xr assembler in Python for fun.

FPGA CPU News, Vol. 1, No. 3
Back issues: Aug, Apr.
Opinions expressed herein are those of Jan Gray, President, Gray Research LLC.


Copyright © 2000-2002, Gray Research LLC. All rights reserved.
Last updated: Mar 18 2001