Reimplementing Transputers
Usenet Postings
Newsgroups: comp.sys.transputer,comp.arch.fpga
Subject: Re: Emulating a transputer on FPGA
Date: Tue, 10 Aug 1999 10:19:54 -0700

Ram Meenakshisundaram wrote in message <37B04CFB.DA77A8F0@olf.com>...
>What would it take to emulate say a T225 or T425 transputer on a FPGA
>(without any external links). Since I am new at this, what would be an
>ideal FPGA to do this. Can I accomplish this on a XC4003E?? How many
>gates would I need on the FPGA to do this. Thanks.

I have thought about this before. (If you configure a big array of FPGAs as CPUs, you might as well have them talk to each other.)

The answer: it depends. By "emulate" do you mean drop-in-replace? If so, I can't say, because I never studied the transputer instruction set architecture, and because it depends upon whether you are willing to handle divide, floating point, etc., in software.

If you mean "build a new machine in the transputer *style*", with high integration of on-chip components and instruction set support for message sends and fast task switching, etc., then here are a few back-of-the-envelope estimates for you.

I recall the inmos T414, 1985, which had a 32-bit processor, 2 KB of on-chip RAM, four 10(?) Mb/s links, and an integrated DRAM controller. A T414-20 had a 20 MHz clock and did about 10 million transputer instructions per second. (I think. I am not a transputer expert.)

A pipelined 16-bit datapath for a two-cycled 32-bit RISC requires about 8x9=72 CLBs, and its control unit could be ~50 CLBs. A full 32-bit datapath adds another 72 CLBs.

Consider the 2 KB of on-chip RAM. In an XC4000, each CLB can store 32 bits. 2 KB * 8 b/B * 1 CLB/32 bits = 512 CLBs, or about half of an XC4025E. Or just 4 of the 512 B embedded RAM blocks on a Virtex part. I assume you were hoping to target an XC4000, so let's reduce our on-chip RAM requirements to 128 B (32 CLBs) and move on.

An EDO DRAM controller w/ page mode support requires a 12-bit register and comparator, a 12-bit mux, and some state machine logic. Call it 20 CLBs.

The four serial links (10 MHz is slow and easy) are each ~12-16 CLBs, although if you time multiplex them you may be able to make four links from one implementation plus a register file.

Totals:

  CLBs     What
  72-144   CPU datapath
  50       CPU control
  32       128 B on-chip RAM
  20       DRAM controller
  20-64    4 serial links
  ----
  194-310  CLBs

An expert might fit it in a 196 CLB XC4005E/XL; more likely you would require 1) an XC4008E or XC4010XL and 2) hand-mapping and floorplanning experience.

(The XC4003E, with 10x10 CLBs, will probably prove too small. You could build a minimalist 16-bit wide datapath in about 4x9 CLBs and could build a CPU there, but you probably won't have room for links or memory controller.)

If you are more interested in exactly implementing the transputer instruction set architecture, and IIRC it is a stack machine, then consider the example of the MSL16 microprocessor. See Leong, P.H.W., P.K. Tsang, and T.K. Lee, "A FPGA based Forth microprocessor", pp. 254-255, in Proc. IEEE Symp. on FPGAs for Custom Computing Machines 1998, and at http://www.cse.cuhk.edu.hk/~phwl/msl16/msl16.html, and Paul Lee Wai Lun's thesis and presentation, "FPGA Implementation of a Forth Processor", at http://home.hkstar.com/~wail/project-7260/project.htm.

To implement a legacy ISA in an FPGA, I advise building a simplified implementation hidden behind a binary rewriting system!

Jan Gray
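As a cross-check on the CLB arithmetic in the posting above, here is a minimal Python sketch that recomputes the RAM cost and the totals from the quoted figures. The names (`BITS_PER_CLB`, `ram_clbs`, `budget`, `fits`) are hypothetical and exist only for this illustration; the per-block estimates are copied from the posting, not derived from any tool or data sheet.

```python
# Back-of-the-envelope CLB budget, using only figures quoted in the posting.
# All identifiers here are illustrative assumptions, not part of any tool.

BITS_PER_CLB = 32  # the posting's figure: an XC4000 CLB can store 32 bits


def ram_clbs(num_bytes: int) -> int:
    """CLBs needed to hold num_bytes of on-chip RAM, rounded up."""
    bits = num_bytes * 8
    return -(-bits // BITS_PER_CLB)  # ceiling division


# (low, high) CLB estimates per block, as listed in the totals table.
budget = {
    "CPU datapath": (72, 144),  # 16-bit datapath vs. full 32-bit datapath
    "CPU control": (50, 50),
    "128 B on-chip RAM": (ram_clbs(128), ram_clbs(128)),
    "DRAM controller": (20, 20),
    "4 serial links": (20, 64),  # range as given in the totals table
}

low = sum(lo for lo, _ in budget.values())
high = sum(hi for _, hi in budget.values())

# Device capacities as stated in the posting.
devices = {"XC4003E": 10 * 10, "XC4005E/XL": 196}


def fits(device_clbs: int, needed_clbs: int) -> bool:
    """True if the estimate fits in the device, ignoring routing overhead."""
    return needed_clbs <= device_clbs


if __name__ == "__main__":
    print(f"2 KB of RAM: {ram_clbs(2048)} CLBs")
    print(f"Total: {low}-{high} CLBs")
    for name, clbs in devices.items():
        print(f"{name} ({clbs} CLBs): low estimate fits = {fits(clbs, low)}")
```

The `fits` check compares raw CLB counts only. It says nothing about routability, which is why the posting pairs the XC4005E/XL option with hand-mapping and floorplanning experience.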
Copyright © 2000, Gray Research LLC. All rights reserved.