Goodard v. Google, Inc.

Filing 158

Download PDF
Goodard v. Google, Inc. Doc. 158 Att. 9 Dockets.Justia.com llllllllllllllIll1 1 1llllll11l111l1111l ll1 111l1111l1 1111 l l1 l l1 United States Patent de Nicolas et al. 1541 TRANSLATING A DWAMIC TRANSFER COhTROL INSTRUCIION ADDRESS IN A [75] US005 167023A 1191 [I]] 1451 Patent Number: Date of Patent: 5,167,023 Nov. 24. 1992 SIMULATED CPU PROCESSOR Inventors: Arturo M. de Nicolas; John C. o'Quin, JI', both Of Austin*Tex. 173) Assignee: [2l] [22) International Business Machines, Armonk, N.Y. Appl. No.: 625,780 Filed: Dec. 7, 1990 tion", AmigaWorld, vol. 1, No. 2, Nov./Dec., 1985, pp. 34-35. "SoftPC", Insignia Solutions, Inc., IS1 Soft PC Data Sheet Rev., 3.0, Jan. 87, 8 pages. Warner, E., "Unix-Based Workstations to Run DOS", Info World, ~ ~ 6, 1987, p. 8, 1, May, C., "Mimic: Fast System/37O Simulator", SIGA PLAN. 1987, Proceedinns of the ACM SIGPLAN '87 Symposium o n InterprGters and Interpretive Techniques, Jun. 87, pp. 1-13. Artorney, Agenr, or Firm-Wayne Primary Examiner-Kevin Smith [57f A. Kriess 1631 Related U.S. Application Data Continuarm of Ser. NO. 151,137, Feb. 1, 1988. abandoned. P. Bailey; Marilyn D. ABsTRAcr [Sl] Int.Cl,` ................. ; 364/232.3; 364/262.4; 3tW247.6; ......................... G06F 9/30 5; 395/500; [58] Field of Search 395/500, 375, 775 [561 References Cited U.S. PATENT DOCUMENTS 8/19S2 Kareds ]/I963 Foxlick .., S/1986 Fisk et al. ]/I987 Ballard .... 10/1987 Saito .................................... 4.721,484 2/1988 Saito ..... 4,841,476 6/1989 Mirchell et nl. ..................... 4,347,365 4,370,709 4,587,612 4,638.423 4,700,291 364/247; 364/247.7; 364/262.9 361/200 364/900 FOREIGN PATENT DOCUMENTS 0217068 4/1967 European Pai..Off. . 60-71. Graham, C., "Amiga's Trump Card: IBM PC Emula- OTHER PUBLICATIONS K. J. McNeley et ai., "Emulating a Complex Instruction Sei Computer with a Reduced Instruction Set Computer", IEEE Micro, Feb., 1987, vol. 7, No. 1, pp. The system and method of this invention simulates the flow of control of an application program targeted for a specific instruction set of a specific processor by utilizing a simulator running on a second processing system having a second processor with a different instruction set. The simulator reduces the number of translated instructions needed to simulate the flow of control of the first processor instructions when translating the address of the next executable instruction resulting from a dynamic transfer of control, Le., resulting from a return instruction. The simulator compares the address that is loaded at run time by the return instruction with the return address previously executed by that instruction. If the last return address matches, the location of the return is the same. If the last return does not match, a translate look-aside buffer is used to determine the address. If the translate look-aside buffer does not find the address, then a binary tree look up mechanism is used to determine the address of the next instruction aAer a return. The performance of the simulator is enhanced by utilizing the easiest approaches first in the chance that a translated instruction will result most efficiently. 7 Claims, I1 Drawing Sheets Nov. 24,1992 Sheet 1 of 11 MEMORY-420 f P 11 ROS .- I BIOS I I 113 I PROCESSOR atent Nov. 24, 1992 Sheet 2 of 11 5,167,023 START SI~ULATOR USER I N V O K E S APP I G. 2 Nov. 24, 1992 Sheet 3 of 11 9 167,023 I r 44 i 44 4 RET 113 - Be % Q, m 0 T rl 0 ' c x .4 1: 110 cu 0 U W rc 50 0 w n -.. ..I # 3 ; n Q n 3m 0 L4 LT tQ m n , z" m d 0 0 1cP rh 0 1 r- 0 T 0 c cn m 0 T 0 a .. 0 0 .. 0 0 -c 0 0 0 T I-c. .. - 0 .. 0 0 ru 0 0 0 0 N 0 0 N 0 CN 0 0 0 h i 0 0 a 0 0 N Q 0 0 0 N 0 atent Nov. 24, 1992 Sheet 5 of 11 33100 r6 4 K BLOCK: 0200 04 400 / 3 11 4 05 4 03 I I 31 L 50 400 31 -L 4 07 4 09 34- L c 401 I 408 3i-L IOD 4 00 c OIIO 31 L I Nov. 24, 1992 Sheet 6 of 11 / I 00 NILZ 130 CIS 81T BE0 CALI 6 SHLA25< Chi6 T I 2, AX #8, OXOOff TI2, 9 Chi6 ` RET 128 LH CAS LN ` c EXTS 0 SZAP C BEQ XI L BEQX .CALI 6 HEN USED ON EACH T R A ~ S L A T I OOF il 80286 ~ I UO R!, R2, Ox8OFF 2CYCtE fCYCLE 1 CYCLE LC RI,O(R!) EXTS R i , R I BNE SUBROUTINE SEGUEUT 9 Rf +ADDRESS OF STATUS BYTE R4 <a STATUS VALUE OR F CHECK IF STATUS IS ZERO OR NOR-2EAO JMP TO SUBROUTINE IF STATUS IS NO#-ZERO COPYRIGHT IM CORPORATtON 1988 B FIG, IO atent Nov. 24, 1992 Sheet 7 of 11 5,167,023 SIMULATOR IO - FIRST I ~ S T R U C T I O N S L O O K AT N E W B L O C K O F X I1 BEQ t-4,tL0,PREV10USJ TARGET P# COMPARE B Y T E REVERSE0 IP RELATIVE CONDITIO~AL~ R A IF THEY ~ A T ~ H ~~H C O P Y R I G H T l8tl C O R P O R A T I O N 1988 FIG 5 0 ur, Nov. 24, 1992 Sheet 9 of 11 I I I OF TRAff $1 T 1 ON A 175 Nov. 24, 1992 Sheet 10 of 11 5,167,023 0 f/O A D A P T E R 00000 9FFFF hob00 BFFFF f ROW) ROS I Nov. 24, 1992 Sheet 10 of 11 atent Nov. 24, 1992 Sheet 11 of 11 S, 167,023 00000 Jt6* RA A0000 j'63 VIDEO 164 codoo FOOOO F F F F F k 165 r, P6 I I I AP 204 t 95 L 2k0 214 245 216 ~0000000 FOOOO KEY FOR MEMORY CONTEIYT TYPE Y 0 000 0000 i246000 @ Bo@ t S O I l '46000i& 1 4 4 4 244444000 %'@ 6 6 6 0 0000 00 H0Ho Mo 0 @@ &d 'I I I I t N S T ~ U C T l O NEWTRY POINT 2 SUBSEQUENT BYTE O AN f INSTRUCTION 4 MERGED I N S T R U ~ T I O N 8 BREAIP0I)IT SET ON THIS iWSTRUCTIOW 16 V I D E O C CODE D DATA '9 V I D E O processing system as it would have been on the processing system for which the application was originally written. As a result, software applications which may have had a long development cycle may become obso5 lete quickly. The demise of the earlier written software This is a continuation of application Ser. No. is all the more tragic when the function of the applica07/151,137, filed Feb. 1, 1988, now abandoned. tion as originally written is still very much pertinent and in demand on the new hardware processing systems. CROSS-REFERENCES T O RELATED s As a result, there i typically only a limited amount of APPLICATIONS 10 "new" software available that is specifically written for Ser. No. 820,451, filed Jan. 17, 1986 for a VIRTUAL the "new" hardware design when the new hardware is TERMINAL SUBSYSTEM, currently co-pending, initially released into the market place. T h i s is due in and assigned to the same assignee as the present invenpaft by the long development cycle of creating software tion. application programs, and the confidentiality of the new Ser. No. 151,136, filed Feb. I , 1988 forCONDITXON 15 hardware design by the manufacturer prior to releasing CODE GRAPH ANALYSIS FOR SIMULATING A the hardware into the market place. The software manCPU PROCESSOR, now U.S. Pat. No. 4,951,195 and ufacturer has to know certain facts about the hardware assigned to the same assignee as the present invention, of a processing system before a software application which is hereby incorporated by reference. program can be written for the processing system. Ser. No. 151,123, filed Feb. 1, 1988 for a SYSTEM 20 Ideally, a manufacturer of processing systems would AND METHOD FOR SIMULATING T H E I/O OF like to have a vast amount of software available to run on the processing system as soon as the new hardware A PROCESSING SYSTEM, now abandoned, and for the processing system is announced into the market assigned to the same assignee as the present invention, place, A customer would more likely invest in a new which is hereby incorporated by reference. Ser. No. 151,135, filed Feb. 1, 1988 for a MEMORY 25 processing system if the customer knows that an abundant supply of software is already available for use. MAPPING AND SPECIAL WRITE DETECTION There have been several approaches in tapping the IN A SYSTEM AND METHOD FOR SIMULATvast amount of software that has previously been writING A CPU PROCESSOR, now abandoned, and asten for "older" hardware designs. A previous hardware signed to the same assignee as the present invention, M approach for being able to run appkations originally which is hereby incorporated by reference. written for another processor is to build the new proA portion of the Disclosure of this patent document cessing system with a coprocessor. In this way, the contains material which i s subject to copyright protecprocessing system can run applications for both types of tion. The copyright owner has no objection to the facprocessors, the new processor and the old processor. simile reproduction by anyone of the patent document or the patent disclosure, as ir appears in the Patent and 35 For example, the IBM RT PC contained an IBM PC AT coprocessor in order to use applications that were Trademark Office patent file or records, but otherwise originally written for the IBM PC AT. However, since reserves all copyright rights whatsoever. the coprocessor was supported at a low level in the BACKGROUND OF THE INVENTION operating system, the coprocessor could not take full 1. Field of the Invention 4 advantage of the functions provided by the AIX4 oper0 ating system. One of the functions provided by the AIX This invention relates to data processing systems running applications written for a specific first procesoperating system is multi-tasking as described in cosor of a first processing system, and more particularly to pending application Ser. No. 820,45t, fited Jan. 17, 1986 a system and method of simulating the first processor for a VIRTUAL TERMINAL SUBSYSTEM and for running the applications on a second processing 45 assigned to the same assignee as the present invention, system having a second dissimilar processor. which is herein incorporated by reference. 2. Description of the Related Art The coprocessor, however, limits the user to one Current advances in computer technology have lead session at a time, since the coprocessor included a hardto ever changing processors, otherwise referred to ware adapter for emulating the PC AT. In other words, herein as the central processing unit (CPU), of the pro- 50 once the coprocessor was started, no other instances of the coprocessor could be running. cessing system. Examples of the evolution of various processors are Intel's 8088 processor used in the IBM The coprocessor is also limited to the speed of the PC, Intel's 80286 processor used in the IBM PC AT1, processor of the fist processing system and cannot take Intel's 80386 processor used in the IBM Personal Sysadvantage of faster second processing systems as they tcm/22 model 80, and the IBM Research/ OPD Micro- 55 evolve. processor (ROMP) which utilizes a Reduced InstrucA second approach is to simulate the second procestion Set Computer (RISC) architecture in the IBM R T sor through software A software simulator provides a PC3.Other processors include Motorola's 68000,68020 mechanism to run previously written software for one among others. proccssor on a new processing system having a different The hardware of various processing systems changes 60 processor. A software approach to simulation allows rapidly to take advantage of the increased processing taking advantage of faster second prcccssing systems as power of emerging processors. A disadvantage of they evolve. It also allows the use of multitasking capachanging hardware is that the software written for prebilities of the operating system to provide muluple invious processors typically can not be used on the later stances of the first processor. hardware technology. In some cases where an applica- 65 Some software simulators on the market today include SoftPC, by Insignia Solutions, and the Amiga tion can be used on a different processing system other than the one it was originally written for, the perforTransformer by Simile Research Inc. for Commodore's Amiga (based on Motorola's 68000). Information on this mance of the application is not as good on the different TRANSLATING A DYNAMIC TRANSFER C O m O L INSTRUCTXOK ADDRESS IN A SIMULATED CPU PROCESSOR ' 1 5,167,023 system was published in the article "Amiga's Trump Card: EBM PC Emulation", AMIGA WORLD,Vol. I , NO. 2, NovemberfDecember 1985. Phoenix Technologies also provided a simulator to simulate the Intel processor for the Apollo machine which has a Motorola 5 68000 processor. Any specific CPU processor has a specific instruction set. When a software application program is developed for a specific CPU processor, it is compiled into object code. The object code is targeted to run on any c p u 10 that supports the specific instruction set. A simulator takes object code that was written to run on a specific instruction St% and COnVertS it 10 run on a different Processor which may have a similar or different instructiOn Set. The more the two instruction sets of the two 15 processors are different, the more difficult it is to simulate the other processor. For example, the Intel 80286 processor has a rich insU`uction set in that it provides a wide variety of instructions. Each instruction is tailored specifically for 20 a particular type of situation. Additionally, each instrucmay be able to do In the ROMP processor in the R T PC has a reduced instruction set (RISC) processor which provides fewer instructions and less function per instruction. As each 25 instruction.in the Intel 80286 may be able to do several tasks, more instructions would be required with the ROMP RISC pracessor to accomp'lish the same rssks, However, the speed of a processor can be increased by sjmpjifying the instruction f,)though in- 30 structions are required, no additional time i s consumed on complicated instructions while executing the more common and simpler tasks. created a Previous methods of software subroutine that would sirnutate the effect of a instruc- 35 tion, E~~~~ the machine being simulated needed to the run that instruction, the subroutine would be called in order to decode and execute that instruction. The problem with this approach is that the overhead of decoding the instruction occurs every time the subroutine is 40 the speed of the called and executed, simulated processor is affected, Instead of calling a subroutine each time an instruction needed to be executed, another software simulation approach compiled a shorter sequence of host machine 45 instructions to simulate an instruction. AS a result, the overhead of decoding and .translating the jmtructjon occurs only once, during the first time the instruction is . encountered. This translation is then saved. From then on, every time that instruction is simulated, the transla- x) tion is executed. This is often referred to as a second generation simulator. A first generation simulator will take an instruction one at a time and decode it in real time and execute it. The decoding is done for each instruction as each instruction is needed. A second gen- 55 eration simulator will go through the instructions one at a time, translate the instructions, and then reuse that translation instead of going back and translating again. A previous second generation simulator was the simulator that simulated the IBM ROMP CPU on the IBM 60 System/370 called RSIM. This simulator reserves a fued amount of storage for each instruction (16 bytes for every half word) called cells. IBM 370 instructions would then be generated for each one of these cells for each R T instruction. If the amount of code generated is 65 less than what would fit in one cell, which is usually the case, then it branches to the next baundary of the next cell. If the amount of code generated to simulate the 3 5,167,023 instruction can not fit into one cell, then a subroutine call is generated thar branches to a run time environment set of routines which performs the emulation and returns back to the cell to complete execution Another simulator simulates the processor of the IBM System/370 on the IBM R T PC which is described in the following article: May, C. "Mimic: A Fast System/370 Simulator", presented Jun. 11, 1987 at the Association of Computing Machinery Symposium on Interpreters and Interpretive Techniques, and published in SIGPLAN, 1987 proceedings of ACM. A first generation simulator runs 50 to 100 host machine instructions per simulated instruction, A second generation simulator an average of 10 host machine instructions per simulated instruction, If a a k a either 50 Or 10 instructions to simulate one instruction on the simulated machine, the second processor m n i n g the shulator mmt be either 50 or 10 times 8$ fast respectively as the simulated machine be comparable in wrfomance, It is therefore desirable to further reduce the of instructions per each simulated or translated instruction than what has previously been accamplished in the art, For if a coutd be designed to use only 4 instructions per simulated instruction, and the simulator processor is more than 4 times faster than the simulated machine processor, the simulator will be faster than the original machine being simulated. A user then increased wfomance by using the to run an apptication program than simulated by using the machine for which the program was originally written. Therefore, the overall problem to OVerCOme in Sh-mlating another Proc=or is to further reduce the number of simulator (host) instructions per simulated instruction in order to increase the processing speed of the simulator. SUMMARY OF THE INVENTION It i s therefore an object of this invention to reduce the average host machine instructions per simulated machine instructions. The SimUlatOr O f this invention runs applications Ongfor a different Processor by of inally software emufation. The software approach of simufation allows the flexibility of utilizing the functions of the operating system of the simuIatOr m ~ h i n e In the Pre. fen& embodklent O this invention, the Simulator runs f as an application on the AIX operating system of the RT pc. It therefore can take advantage of the multi-fig, multi-user capab es of the AIX operating system to &low multiple applications offgrnally written for the pc AT to run concunently without any change to the application itself. The method of simulation of this invention provides faster processing capability of the simulated processor than the previous methods of processor simulation by reducing the number of host machine instructions per simulated machine instruction. This was achieved by identifying key processing =cas that bad used more instructions than desired, and then by crating new methods to achieve the processing tasks through fewer instructions. To increase the CPU simulation, Le., reduce the average number of host instructions per simulated instruction, several key processing areas were identified that were currently utilizing more instructions than desired. 4 First, the processing task of maintaining and keeping the correct values of the condition codes was identified as disclosed in Ser. No. 07/151,136, filed Feb. 1, 1988 for CONDITION CORE GRAPH ANALYSIS FOR SIMULATING A CPU PROCESSOR, currently co- 5 pending, and assigned to the same assignee as the present invention, which is hereby incorporated by reference. T h e method of this invention uti1,izes a graph analysis technique previously applied to compiler technology, and applies it to CPU simulator technology to 10 dynamically determine whether condition codes will be needed by any subsequent instruction. The simufator saves sufficient information to generate those condition codes that may be needed. Otherwise, the number of translated instructions required are reduced for that 15 instruction, if it is determined from the graph analysis that the condition codes are not needed. Another area that was addressed to reduce the average number of instructions generated involves translating the addresses of instructions. The addresses of se- 20 quential instructions propose no great problem in translating since the translated address will also be the next sequential address. The difficulty arises in those instructions thar are not common when viewing a program statically, bur occur quite frequently in a dynamic mix 25 of instructions. A primary example is the return from a subroutine instruction. A processing system spends a comparatively large amount of time executing a return from subroutine. One difference with this type of instruction is that the ad- 30 dress of the next instruction to be executed cannot be determined statically by just looking at the program. The program has to be actually running in order to determine the address to which that parficular instruction will return. Most programs would probably return 35 to the same place they had returned the last time that instruction had executed, as frequently occurs in a programmed loop. It is quicker for the simulator to compare the address that is loaded at run time by the return instruction with the return address previously executed 40 by that instruction. If the last return address matches, the location of the return is the same. Only if it fails, are the more elaborate schemes of a translate look-aside buffer, and then binary trees, used as the look up mechanisms to determine the address of the next instruction 45 after a return. T h e three tier approach in determining the address of the next instruction simplifies the determination by using a compare in the case where the currently executing instruction returns to the same address a the last 50 s time it executed, by using a translate look-aside buffer if it retums.to an address that was returned to at least once before, if not immediately before, and by using binary trees as a last resort if the preceding two cases fail. The return address is used in a general way to refer to an 55 address of any instruction which transfers control by computing the next target address from a value in a r register o memory. A third prime procasing area that was identified for reducing the average number of host instructions per 60 simulated instruction was the processing task of being able to detect what happens when an instruction stores to memory as disclosed in Scr. No. 07/151,135, filed Feb. 1, 1988 for a MEMORY MAPPING AND SPECIAL WRITE DETECTION IN A SYSTEM AND 65 METHOD FOR SIMULATING A CPU PROCESSOR, currently co-pending, and assigned to the same assignee as the present invention, which is hereby incor- J c 5,167,023 porated by reference. The simulator of this invention provides a method for checking any first processor instruction which updates memory, to determine if the instruction is modifying a subsequent instruction or performing a video buf'fer update. The method of this invention reduces the number of cycles, Le, instructions, required to detect this modification. BRIEF DESCRIPTION OF THE DRAWING FIG. 1 is a block diagram showing the processing system environment of the preferred embodiment of this invention. FIG. 2 is a flow chart showing the initial steps in starting, the simulator of this invention. FIG.3A shows a graph analysis of a sample flow of control of fmt processor instructions that are to be translated by the simulator of this invention. FIG. 3B shows a second example of a flow of control of first processor instructions to be translated. FIG. 3C illustrates a graph analysis of the condition codes in the flow of control of the first processor instructions shown in FIG. 3B. FIG. 3D illustrates the flow of control of second processor instructions translated from the graph analysis of FIG. 3C. FIG. 4 is a flow chan of the translation. FIG, 5 is the program code used in the first method of the three tier approach f r determining &hetranslated o instruction address of the next executable instruction. FIG. 6 shows the data structures of the second and thud approaches for determining the translated instruction address of the next executable instruction by mapping an instruction of one instruction set to a comesponding translation address of a simulator having a different instruction set. FIG. 7 is a flow chart of the three tier approach for determining the translated instruction address of the next executable instruction. FIG.8 is a block diagram showing the t y p e and contents of the memory of a processing system. FIG. 9 illustrates the mapping of the memory of the first processing system into the memory of a second processing system, and a status table for indicating the type of contents of a store to memory. FIG. 10 is the program code used to find the corresponding byte in the status table of a memory location in either shared memory o on an adapter. r DESCRIPTION OF THE PREFERRED EMBODIMENT The preferred embodiment of the system and method of this invention simulates a processing system such as the IBM PC AT which utilues an Intel 80286 processor having a complex instruction set, on a procffsing syst m 1 as shown in FIG. 1such as the IBM RT PC which . e utilizes the ROMP processor having a reduced instruction set computer (RISc) technology. Although the RISC processor has less function per instruction, it can process instructions faster. The architectures between an late1 80286 based machine and 8 RISC based machine arc quite different from each other. The greater the differenceahere is between the architectures of two processing systems, the more difficult it is to simulate the one processor on the other processing system. For more information on the RT PC processing system, the IBM PC AT processing system, a d Intel's 80286 processor, the following references are suggested, and are hereby incorporated by reference. Bach, 6 Operating Sysrem Commands Reference, Version 2.1, K M Corporation, SCU-0790. A I X Operaring Sysrem B Managing the A I X Operaring System, Version 2.1, IBM Corporation, SC23-0793. AIX Operaring programming Took; and Interfaces, Version 2.1, 1BM CorWration, SC23-0789. A I X Operating Syslem ' Technical ReJerence, Version 2.1, Volumes 1 and 2, IBM Corporation, SC23-0808 and SC23-0809. IBM RTPersonal Cornpurer Technologv, IBM Corporation, SA23-)057, 1986. Virlual Resource Manager Technical Reference, Version 2.1, Volumes 1 and 2, IBM Cowration, SC23-0816and SC23-0817, iAPX 286 Programmer's Reference Manual Xncluding the OiPX 286 "Vmeric Supp!emenf, Intel, 210498-003,1985, and IBM PC A T Technical Reference Manual, IBM Corporation, March 1984. As shown in FIG. 1, the simulator 10 runs as an application program on the operating system l2 Of the processing system 1. Referring to FIG. 2 in addition to FIG. 1, as the simulator 10 is started, step 2 , the simulator 10 copies BIOS 13 containing 80286 instructions from read only storage (ROS)IS,otherwise referred to as read only memory (ROW, into the operating SYSshared memory segment step 3'The sirnulator 10 translates the BIOS 13, step 4, which loads the operating system (DOS) 18 for which the application 19 was originally step 5, The simulator 10 then translates and executes that operating system 18,step 6. A user invokes the application x9Bt the operating systern prompt, step 7, and the simulator 1 translates and 0 executes the application program 19,step 8. T~increase the performance of the cpu simulation, i.e., reduce the average number of host instructions per simulated instruction, several key processing areas were identified that were currently utilizing more instructions than desired. 1. Condition Code Graph Analysis First, the processing task of maintaining and keeping the correct values of the condition codes was identified. Many processor instructions affect the flags register 20 (FIG. 3 c ) by updating the condition codes 21-26 in it to reflect the rault of an operation, There are six different condition codes, the overflow flag 21, the sign flag 22, the zero flag 23, arithmetic flag 24 also known as the half carry, the parity flag 25, and the carry flag 26. These condition codes 21-26 indicate common conditions such as whether the result was zero, whether it was negative, whether a carry Out of a register oc. curred, or yhether an overflow condition resulted. It also includes conditions that indicate the parity (even or odd) of the lower byte of the result and the carry out of the lower 4 bits of the operation (half carry). To simulate a first processor instruction set by keeping an up to date version of the first processor flag register 20 would require additional cycles for every instruction that affects the register. This is especially true if the architecture of the fmt processor defines several different combinations of condition code updates. For example, the condition codes may be always set or cleared, computed, left unchanged, or left undefined. A previous simulator RSIM, the RT PC processor .simulator on IBM S/370,used a scheme to reduce the overhead of keeping track of the condition codes. Registers were reserved for this purpose. The reserved M. J. The Design of the UNIX Operaring Sysrem, Prentice Hall, 1986. Lang, 1.G. and Mothersole, T.L., Design ofthe R T PC I'RM Nucleus, Sep. 1, 1986. A I X 5 7 5,167,023 10 15 *O 25 M 35 40 45 so 55 60 65 register contained the 32 bit values involved in an operation, and the type of operation performed. The idea was to postpone the work of determining the value of the condition code until an instruction that actually requires it is simulated. Nevertheless, there is still the overhead of saving away these values and types. This overhead is unnecessary if all possible subsequent paths contain an instruction which modifies the same condition codes without any intervening instruction needing them. In order to determine which modifications of the flag register #r actually were used by subsequent instructions, the shulator 10 of this invention relies on information provided by a graph analysis 30,FIG. 3A, or 50, FIG. of a block of f i t procPssor instructions i ~ ] . me% ttchniques were previously applied in high level language compilers. However, it is believed that this is the first time that this technique has been applied to the problem of processor simulation. 10 reaches a new blwk of first When the processor instructions lM], step 131,FIG. 4, the sirnulator invokes the translator 27 to a second processor translation, step 132, The translation occurs in three phases, First,a graph 50, FIG, 3c, is built, step 133, FIG, 4, which represents the Structure of the block of first proinstructions lWf' Each !lode 1 the graph ' corresponds to One instruction lw~ A first processor with inforeach instruction decoder "lS mation about the instruction 100 including which registers and condition codes 21-26 the instruction needs, block 42, in order to execute, and which condiSets, block 43,as a tion codes 23-26 the hstruction register result ofexecuting. f t will be noted that the 42 and the set register have a =parate bit for the Overflow flag 21 and the carry flag 26p and groups the remaining condition codes 22-25 together in one bit. Therefore if any of these condition codes 22-25 are used or set by an instruction 100,the middle bit of registers s indicate. o 43 Second, step 134, FIG. 4, the graph 30 is analyzed to determine where intempts must be Polled, how 1 0 order the translations So to minimize branching, and which condition codes defined by an instruction are actually "sed, Third, step 136, FIG. 4, the code generator 29 i s b ~ k e to tmxdate the graph 3 into second Processor d 0 instructions 130, FIG. 3D. At the time of code generation, step 136, the information in the graph 50 will indicate if the condition codes 23-26 defined by an instruction are actually used. For example, in a majorily O f c~ses, condition codes the defined by a shift (SHL) instruction 125, FIG. 3B, 3C are not actually used. The code generator 29 can use this knowledge and generate a single second processor instruction 135,FIG. 3 D where four or five instructions may have bem required to save the operands of the operation if the condition codes were needed. It i Seen s in the flow of control of the instructions 100 of FIG. 3B, 3C that the condition codes 21-26 for the ADD instruction 126 were needed i a subsequent instruction 128 in n this example. Consequently, the translated instructions 130, FIG. 3D, resulted in six instructions instead of just one instruction had they not been needed. Nevertheless, should the graph analysis 50 indicate that the condition codes 21-26 are not needed, the extra translated instructions are not generated. 8 x "' 427 The simulator 10 translates first processor instrucwritten for the first processor is called a depth first search. The depth first search means that the instructions 100 into second processor instructions 130, but does not do the translating one instruction at a time. The tions are searched sequentially until an end node (Type 0); e.g., the IRET instruction 109 FIG. 3A is reached. simulator 10 examines the first instruction that is about to be executed, and for which there is no translation, 5 After a type zero node is reached, the search returns to and continues examining subsequent instructions while the last instruction that had more than one descendent. building a graph 30 of those instructions as shown in The order of instructions 100 FIG. 3A, stored in FIG. 3A. memory 120 would be compare 102,jump below 103, Each node 101 in the graph 30 corresponds to one jump if equal 104,increment 108,and interrupt return first proaxsor instruction 100.Each node 101 can have 10 109.After the interrupt return 109,the search would go at most two descendants. In the case of sequentiaf inback to the jump if equal instruction 104 and examine structions 102,105,106, 108,110, 111,112, which transthe second descendent. The second descendent, interrupt return 109 is an instruction that is already stored in fer control only to the next sequential instruction in memory, the nodes 101 have only one descendant 103, memory. Therefore, the search returns to the next pre106, 107,109,111, 112,113,respectively, as shown in 15 vious node that had two descendents, the jump if below FIG. 3A a a vertical line 114, s There can be two descen103, and follows the second path. New code is then dants in the case of conditional branch instructions 103, found in this path. The next instructions are stored in 104,which test for a condition and branch to one inmemory in order as shown in FIG.3A. struction if the condition is true, and continues with the Referring to FIG. 3B and 3C, for each of the nodes 0 next instruction if the condition is false. It is possible for 2 101,the simulator keeps two fields in memory, one is the condition codes 21-26that are needed by that ina node 101 to have no descendants, as is the case for the interrupt return instruction 109. The instruction 109 struction 100, register 42, the other is the condition that has no descendants illustrates an instruction that codes 21-26 that are set by that instruction 100, register 43.At that point a propagation process is performed to transfers control dynamically. Also, a node 101 can have one descendant that is not a sequential instruction. 2 5 optimize which condition codes must be set. This is accomplished by going from the last instruction alloThe unconditional jump instruction 107 is an example of cated to the first instruction allocated. In this order, this. As shown above, there are four types of instructions instructions are taken out of a circular queue, The regis100.These types are numbered in the code as follows. A ter 42 of a node IO1 is updated by ANDing the complinode 101 is numbered `V' if the instruction at the node 30 ment of register 43 of that node with register 42 of all has no descendents, Le., the return instruction 126 FIG. descendant nodes, and then ORing this result into the register 42 of a node 101 being updated. The updated 3C, the interrupt return instruction 109,FIG. 3A, and register 42 reflects the descendants' needs for condition the return instruction 113,FIG. 3A. A node POP i s numbered "1" if it is a sequential instruction and has one codes 21-26.If all descendants are marked done, then so sequential descendent; e.g., compare instruction 121, 35 i this node 101 being updated, and the node 101 is taken s out of the queue. If the node is not done, the instruction decrement instruction 124, FIG. 3C, compare instruction 102,increment instruction 108. FIG. 3A. A node is put at the end of the queue. The queue is gone 101 is numbered "2" if the instruction has two descenthrough until the queue is empty indicating all of the nodes have been processed, or there are some remaining dents ie., jump if below instruction 122,FIG. X, and 103,FIG. 3A. A node 101 is numbered "3" if the in- 40 nodes in the queue that nothing has changed, i.e. a node struction is not a sequential instruction, but has only one has not been updated. descendent, i.e. jump instruction 107, FIG. 3A. A second pass is performed through the graph. In this Having built a graph 50, FIG. 3C, that describes the pass, the register 43 is updated by reducing the condiblock of first processor instructions 100,FIG. 3B, one tion codes set to acknowledge those that are now 0 node 1 1 per instruction lOO, the simulator does an 45 shown as being needed by its descendants. analysis of the graph 50. Each node 101 contains inforAt the end of this analysis, each node 101 indicates mstion on which condition codes 21-26 a first processor which condition codes 21-26 must be set, register 43. instruction 100 normally needs for execution, register The number of condition codes 21-26 could be less than 42,and which condition codes 21-26are set, register 43, the number initially set by that instruction. The set of by that instruction after executing. In the case of the 50 condition codes will include only those condition codes compare instruction 121, the compare instruction 121 that will be used by subsequent instructions. For examd m not need any condition codes 21-26 to execute, but ple, the return instruction 128 indicates in register 42 it will set all of them, register 43. The jump if below that it will use all of the condition codes 21-26 as a (JB) instruction 122 needs the carry condition code 26 precautionary measure since the subsequent instructions in order to execute because that is the condition being 55 are not known until execution. The move instruction tested, register 42,but the JB instruction 122 does not 127 typically does not use, register 42, the condition Set any condition codes after executing, register 43. The codes 21-26,but indicates a use, register 42, since the jump if equal (JE) instdction needs the equal condi123 subsequent return instruction 128 needs them. The add instruction 126 does not use the condition codes 21-24, tion code bit in order to execute, but the JE instruction lZ3 does not set any condition codes after executing. 60 register 42,but sets all of them, register 43, which then The decrement instruction 124sets the overflow flag 2 1 can be used subsequently. The shift instruction 125 and the arithmetic flags 24, register 43, and leaves the normally sets, register 43, all of the condition codes carry condition code 26 unchanged, register 43. The 21-26,but does not need to in this case since the subsegraph analysis continues in this fashion for each of the quent add instruction 126 does not use them. It was rest of the instructions 100. 65 already analyzed that the add instruction 126 would set The nodes 101 are allocated in storage 120 scquenthe condition codes 21-26for subsequent use. This analtially as the application program 1 finds them. The 9 ysis for updating the use register 42 and the set register search through the application program 19 originally 43 continues in reverse order through the list of instruc- 9 5,167,023 10 The simulator 10 translates first processor instrucwritten for the first processor is called a depth first tions 100 into second processor instructions 130, but search. The depth first search means that the instructions are searched sequentially until an end node (Type does not do the translating one instruction at a time. The 0); e.g., the IRET instruction I09 FIG. 3A is reached. simulator 10 examines the firsr instruction that is about to be executed, and for which there is no translation, 5 After a t y w zero node is reached, the search returns to and continues examining subsequent instructions while the last instruction that had more than one descendent. building a graph 30 of those instructions as shown in The order of instructions 100 FIG. 3A, stored in memory 120 would be compare 102, jump below 103, FIG. 3A. Each node 101 in the graph 30 corresponds to one jump if equal 104, increment 108, and interrupt return first processor instruction 100. Each node 101 can have 10 109. After the interrupt return 109, the search would go at most two descendants. In the case of sequential inback to the jump if equal instruction 104 and examine structions 102,105, 106, 308,110, 111,112, which transthe second descendent. The second descendent, interrupt return 109 is an instruction that is already stored in fer control only to the next sequential instruction in memory, the nodes 101 have only one descendant 103, memory. Therefore, the search returns to the next pre106, 107, 109, 111, 112, 113, respectively, as shown in 15 vious node that had two descendents, the jump if below FIG. 3A as a vertical line 114. There can be two descen103, and follows the second path. New code is then dants i the case of conditional branch instructions 103, found in this path. The next instructions are stored in n XO4, which test for a condition and branch to one inmemory in order as shown in FIG. 3A. struction if the condition is true, and continues with the Referring to FIG. 38 and 3C, for each of the nodes next instruction if the condition is false. ft is possible for 20 101, the simulator keeps two fields in memory, one is a node 101 to have no descendants, as is the case for the the condition codes 21-26 that are needed by that ininterrupt return instruction 109. The instruction 109 struction 100, register 42, the other is the condition that has no descendants illustrates an instruction that codes 21-26 that are set by that instruction 100, register 43. At that point a propagation process is performed to transfers control dynamically. Also, a node 101 can have one descendant that is not a sequential instruction. 25 optimize which condition codes must be set. This is The unconditional jump instruction 107 i s an example of accomplished by going from the last instruction allocated to the first instruction allocated. In this order, this. instructions are taken out of a circular queue. The regisAs shown above, there are four types of instructions 100.These types are numbered in the code as follows. A ter 42 of a node 101 is updated by ANDing the compli0 node 101 is numbered "0" if the instruction at the node 3 ment of register 43 of that node with register 42 of all has no descendents, Le., the return instruction 128 FIG. descendant nodes, and then ORing this result into the 3C, the interrupt return instruction 109, FIG. 3A, and register 42 of a node 101 being updated. The updated the return instruction 113,FIG. 3A. A node 101 i s numregister 42 reflects the descendants' needs for condition bered "1" if it is a sequential instruction and has one codes 21-26. If all descendants are marked done, then s o sequential descendent; e.g., compare instruction 121, 35 is this node 101 being updated, and the node 101 is taken out of the queue. If the node is not done, the instruction decrement instruction 124, FIG. X, compare instruction 102, increment instruction 108, FIG. 3A. A node i put at the end of the queue. The queue is gone s 101 is numbered "2" if the instruction has two descenthrough until the queue is empty indicating all of the dents Le., jump if below instruction 122, FIG. 3C, and nodes have been processed, or there are some remaining 103, FIG. 3A. A node 101 i s numbered "3" if the in- 40 nodes in the queue that nothing has changed, i.e. a node has not been updated. struction is not a sequential instruction, but has only one descendent, i.e. jump instruction 107, FIG. 3A. A second pass is performed through the graph. In this Having built a graph 50, FIG. 3C, that describes the pass, the register 43 is updated by reducing the condiblock of first processor instructions 100, FIG. 33, one tion codes set to acknowledge those that are now node 101 per instruction 100, the simulator d w s an 45 shown as being needed by its descendants. At the end of this analysis, each node 101 indicates analysis of the graph 50.Each node 101 contains inforrnation on which condition codes 21-26 a first processor which condition codes 21-26 must be set, register 43. instruction 100 normally needs for execution, register The number of condition codes 21-26 could be less than 42, and which condition codes 21-26 are set, register 43, the number initially set by that instruction. The set of by that instruction aft= executing, In the case of the M condition codes will include only those condition codes compare instruction 121, the compare instruction 121 that will be used by subsequent instructions. For examdoes not need any condition codes 21-26 to execute, but ple, the return instruction 128 indicates in register 42 it will set all of them, regster 43. The jump if below that it will use all of the condition d e s 21-26 as a (JIB) instruction 122 needs the carry condition code 26 precautionary measure since the subsequent instructions i order to execute because that is the condition being 55 are not known until execution. The move instruction n 127 t y p i d l y does not use, register 42, the condition tested, register 42, but the JB instruction 122 does not set any condition codes after executing, register 43. The codes 21-26, but indicates a use, register 42, since the jump if equal (JE) 123 instniction needs the equal condisubsequent return instruction 128 needs them. The add instruction 326 dws not use the condition codes 21-26, tion code bit in order to execute, but the JE instruction l Z 3 dots not set any condition codes after executing. 60 register 42, but sets all of them, register 43, which then The decrement instruction 124 sets the overflow flag 21 can be used subsequently. The shift instruction 125 and the arithmetic flags 24, register 43, and leaves the normally sets, register 43, all of the condition codes carry condition code 26 unchanged, register 43. The 21-26, but does not nccd to in this case since the subsegraph analysis continues in this fashion for each of the quent add instruction 126 does not use them. It was rest of the instructions 100. 65 already analyzed that the add instruction 126 would set The nodes 101 are allocated in storage 120 sequenthe condition codes 21-26 for subsequent use. This analtially BS the application program 19 finds them. The ysis for updating the use register 42 and the set register search through the application program 19 originally (3 continues in revem order through the list of instruc- 9 5,167,023 10 5,167,023 tions 100 previously stored in a specified order to perform the graph analysis. As seen in FIG. 3D,the translation 130 required only one instruction each for the decrement instruction W and the shift instruction 125,FIG. 3C. In Contrast, the add instruction 126 that set the condition codes 21-26 for subsequent use took six instructions. Likewise, the graph analysis 30 of FIG. 3 A would also show that the compare instruction, which sets all of the condition codes, only needs two of the six condition codes in subsequent instructions, the zero bit used by the jump if equal (JE) instruction 104, and the carry bit used by the jump if below (JB) instruction 103.None of the other condition codes are needed. Furthermore, no descendent of the conditional branches shown in FIG. 3A needs the other condition codes. Therefore the simulator can translate the block of the three instructions 102,103,104 as a single unit without having to be concerned with setting the condition codes. Therefore the number of instructions that require the preservation of the corresponding condition codes are reduced. By applying the previous compiler technique of graph analysis to a processor simulator, essential knowledge is gained about the use of condition codes by subsequent instructions. The code generator 2 which 9 translates first processor instructions 100 into second processor instructions 130 uses the condition code information to produce in many cases a single second processor translated instruction. This reduces the unnecessary cycles if the flags register had been kept up to date after every instruction, and improves the performance of the simulator. Besides utilizing the results from a graph analysis to obtain fewer translated instructions to simulate the flow of control, the results from a graph analysis can be used in the process of minimization of interrupt polling as disclosed in Ser. No. 07/151,123, filed Feb. 1, 1988 for A SYSTEM AND METHOD FOR SIMULATING T H E I/O O F A PROCESSING SYSTEM, currently co-pending, and assigned to the same assignee as the present invention, which is hereby incorporated by reference. 11. Instruction Address Translation The simulator in the preferred embodiment of this invention translates the instructions of Intel's iAPX 0286 processor (first processor) used in the IBM PC AT into simulator instructions of the ROMP processor (second processor) used in the IBM R T PC, These translations are saved for reuse when the application program originally written for the simulated first processor transfers control to that same address again. For instructions that do not transfer wntrol, determining the next instruction pointer (IP) is accomplished by adding the length of the instruction to the instruction's instruction pointer (IP). A similar sequence of sequential simulator instructions are generated by the simulator. Because the instructions foUow sequentially, locating the corresponding translations is not required. Another cfass of instructions transfers control statically. That is, the new instruction pointer is computed by adding or subtracting a fixed displacement from the instruction's pointer. These are known as relative branches. The simulator generates second processor relative branches to the corresponding translation. The instruction set contains three instructions that transfer control within a code segment to a new instruction pointer which cannot be determined statically since 11 II? it is loaded from a register or memory at run time. The RETurn from subroutine instruction i s one of these. T h e speed with which this instruction is simulated affects the overall performance of the simulator, espe5 cially if a processing system spends more time executing a RETurn instruction than any other instruction. The other two instructions are the JuMP indirect and the CALL indirect (register or memory). They will be treated the same as the RETum instruction. For example, a first processor addresses instructions 10 by using two registers. The code segment register 33, FIG. 3B describes the location of a 64K block 119 of memory 120, FIG. X. T h e instruction pointer register 31, FIG. 3B,is an offset into that code segment 33. The I5 instruction pointer 31 describes at which of the 64K bytes the instruction 100 is located. The first processor addresses instructions by indexing into a code segment 33 with the instruction pointer 31,FIG. 3G. The simulator of this invention uses the data struc20 tures shown in FIG. 6 to map fmt processor instruction addresses 100 which consist of a code segment 33 and an instruction pointer 31 to the address in memory 120 of the corresponding second processor where a sequence of second processor instructions 130 perform the same 25 function. In the method of this invention, a new instruction pointer 31, and code segment 33, whose values are determined at run time, are converted into the simulator machine (second processor) address of the Correspond30 ing translation for those instructions that transfer w n tro! dynamicaIly, such as the RETurn from subroutine instruction, JuMP indirect, and the CALL indirect, or software interrupt instructions. The simulator uses a three tier approach (FIG. 7)to 35 sirnufate these three instructions that transfer control dynamically. The three tier approach is organized in succession. T h e fastest and most likely case is performed first. The second approach is used should the first approach fail. The third approach is the slowest 40 approach which guarantees a successful conversion of the instruction pointer to an address of the simulator machine, FIG. 6 illustrates the second and third approaches. Referring to FIGS. 5 and 7,the first operation per45 formed on the new instruction pointer is to compare it with the value produced by the previous execution of that instruction, step 141.If the values match, a simulator machine relative branch is performed to the conesponding address. An exclusive OR operation deterx mines if the values match, step 142, and a conditional ) branch transfers control if they do, step 143.This allows for a fast computation of the look up addrcss. Referring to FIGS. 6 and 7, should the above a p proach fail, i.e, the instruction pointer 31 is different 55 from the previous time that particular instruction had executed, a table look up is performed using the address mapping translate look-aside buffer 34 which is similar to a hardware translate look aside buffer. The wnversion from a fust processor instruction to a second pro60 cessor instruction uses the method described below. Second processor instructions 130,when translated, assume certain values of first processor registers, called attributes. Tbesc attributes can be used by the translator to generate more efficient code in certain cases,For 65 enample, if the stack alignment is even, half word instructions can be used to transfer data to and from the stack.Otherwise, two separate byte instructions must be used. T h e value of the code segment 33 and the align- 1L tions 100 previously stored in a specified order to perform the graph analysis. As seen in FIG. 3D, the translation 130 required only one instruction each for the decrement instruction 124 and the shift instruction 125, FIG.3C. In contrast, the add instruction 126 that set the condition codes 21-26 for subsequent use took six instructions. Likewise, the graph analysis 30 of FIG. 3A would aIso show that the compare instruction, which sets all of the condition c o d a , only needs two of the six condition codes in subsequent instructions, the zero bit used by the jump if equal (JE) instruction 104, and the carry bit used by the jump if below (JB) instruction 103. None of the other condition codes are needed. Furthermore, no descendent of the conditional branches shown in FIG. 3A needs the other condition codes. Therefore the simulator can translate the block of the three instructions 102, 103, 104 as a single unit without having to be concerned with setting the condition codes. Therefore the number of instructions that require the preservation of the corresponding condition codes are reduced. By applying the previous compiler technique of graph analysis to a processor simulator, essential knowledge is gained about the use of condition codes by subsequent instructions. The code generator 29 which translates first processor instructions 100 into second processor instructions 130 u r n the condition code information to produce in many cases a single second processor translated instruction. This reduces the unnecessary cycles if the flags register had been kept up to date after every instruction, and improves the performance of the simulator. Besides utilizing the results from a graph analysis to obtain fewer translated instructions to simulate the flow of control, the results from a graph analysis can be used in the process of minimization of interrupt polling as disclosed in Ser. No. 07/151,123, filed Feb. 1, 1988 for A SYSTEM AND METHOD FOR SIMULATING THE I/O OF A PROCESSING SYSTEM, currently co-pending, and assigned to the same assignee as the present invention, which is hereby incorporated by reference. 11. Instruction Address Translation The simulator in the preferred embodiment of this invention translates the instructions of Intel's iAPX 0286 processor (first processor) used in the 1BM PC AT into simulator instructions of the ROMP processor (second processor) used in the IBM RT PC. These translations are saved for reuse when the application program originally written for the simulated first processor transfers control to that same address again. For instructions that do not transfer control, determining the next instruction pointer (IP) is accomplished by adding the length of the instruction to the instruction's instruction pointer (IP). A similar sequence of sequential simulator instructions are generated by the simulator. Because the instructions follow sequentially, locating the corresponding translations is not required. Another class of instructions transfers control statically. That is, the new instruction pointer is computed by adding or subtracting a fixed displacement from the instruction's pointer. These are known as relative branches. The simulator generates second processor relative branches to the corresponding translation. The instruction set contains three instructions that transfer control within a code segment to a new instruction pointer which cannot be determined statically since 11 5,167,023 it is loaded from a register or memory at run time. The RETurn from subroutine instruction is one of these. The speed with which this instruction is simulated affects the overall performance of the simulator, especially if a processing system spends more time executing a RETurn instruction than any other instruction. The other two instructions are the JuMP indirect and the CALL indirect (register or memory). They will be treated the same as the RETum instruction. For example, a first processor addresses instructions by using two registers. The code segment register 33, FIG. 3B describes the location of a 64% block 119 of memory 120, FJG. X. The instruction pointer register 31, FIG. 3B, is an offset into that code segment 33.The instruction pointer 31 describes at which of the 61K bytes the instruction 100 is located. The fvst processor addresses instructions by indexing into a code segment 33 with the instruction pointer 31,FIG. E. The simulator of this invention uses the data structures shown in FIG.6 to map fmt processor instruction w addresses 1 1which consist of a code segment 33 and an instruction pointer 31 to the address i memory 120 of n the corresponding second processor where B sequence of second processor instructions 130 perform the same function. In the method of this invention, a new instruction pointer 31, and code segment 33, whose values are determined at run time, are converted into the simulator f machine (second processor) address o the corresponding translation for those instructions that transfer c o n trol dynamicdly, such as the RETurn from subroutine instruction, JuMP indirect, and the CALL indirect, or software interrupt instructions. The simulator uses a three tier approach (FIG. 7) to simulate these three instructions that transfer control dynamically. The three tier approach i s organized in succession. The fastest and most likely case is performed first. The second approach is used should the first approach fail. The third approach is the slowest approach which guarantees a successful conversion of the instruction pointer to an address of the simulator machine. FIG. 6 iflustrates the second and third approaches. Referring to FIGS. 5 and 7, the first operation performed on the new instruction pointer is to compare it with the value produced by the previous execution of that instruction, step 141, If the values match, a simulator machine relative branch is performed to the corresponding address. An exclusive OR operation determines if the values match, step 142, and B conditional branch transfers control if they do, step 143, T h i s allows for a fast computation of the look up address. Referring to FIGS. 6 and 7, should the above approach fail, Le, the instruction pointer 31 i s different from the previous time that particular instruction had executed, a table look up is performed using the address mapping translate look-aside buffer 34 which is similar to a hardware translate look aside buffer. The conversion from a first processor instruction to a second processor instruction uses the method described below. W n d processor instructions 130, when translated, assume certain values of first processor registers, called attributes. These attributes can be used by the translator to generate more effrcient code in certain cases. For example, if the stack aligmttent is cven, half word instructions can be used to transfer data to and from the stack. Otherwise, two separate byte instructions must be used. The value of the code segment 33 and the align- 5 io 15 20 25 30 35 4 4 45 M 55 60 65 men1 of the stack pointer 35 for a block of first processor instructions 100 are known as the attributes of the block. Both the code segment 33 and the stack pointer 35 are 16 bit fields. The attributes occur in the code block header 36, and the address mapping translate 5 look-aside buffer 34. Translations 130 of first processor instructions 100 with different attributes are kept separate. The method takes the lower thirteen bits 32 of the instruction pointer 31 and uses this as an index into the 10 table 34 which is aligned at a 64K byte boundary at a fixed virtual address, step 144. The entry contains two words. The first word contains the attributes. The first 16 bits CS145 are the value of the code segment 33,the next bit 47 contains the alignment of the stack pointer 35 I5 denoted as S1, the next bit 46 denoted as V1 is a valid bit which is zero if the entry is not valid, and one if it is a valid entry. There are some unused bits 51. The last 3 bits 48 of the 32 bit word are the upper three bits 49 of the instruction pointer 31 denoted as IP1. 2 0 Therefore, the method is to index the address m a p ping table 34 with the lower 13 bits 32 of the instruction pointer 31, step 144, and compare the first 16 bits CSl45 with the current value of the code segment 33, step 145. This indicates that the instructions are in the Same code 2s segment indicating that the previous instruction may have be& executed recently. If that matches, the lower bit 41 of the stack pointer 35 is compared with S147, to insure that the assumptions made in the translations about the alignment of the stack are not violated, step 30 146. If this matches, and V1 45 is on indicating a valid entry, and IP1 48 matches the upper 3 bits 49 of the instruction pointer 31, then there is a hit in the addressing mapping translate look-aside table. That is, the current instruction is exactly identified with an earlier 35 translated second processor instruction. The next word 52 is a 32 bit address of the second processor instructions 130 that simulate the first prpcessor instruction 100, step 175. If there is a miss, Le., anyone of the above comparisons fails, translation proceeds by accessing the 40 hash table 37 as described in the third approach below, and the address mapping translate look-aside buffer 3 4 entry is updated with the new attributes and the new branch address for future reference. In the event that the look up is successful, the XIL 45 and BEQ instructions are m d f e accordingly. The oiid value to be EXCLUSIVE ORed by XIL will contain the new instruction pointer, and the relative offset for the branch instruction will indicate the new target ad50 dress. Control is transferred to the new target. The third approach also is shown in FIGS.6 and 7. The middle six bits 38 of the code segment 33 ere XORed with the concatenation of the lower file bits 39 of the code segment 33 and the lower bit 41 of the stack pointer 35. That gives a six bit index into the code block 55 hash table 37, step 148.The code block hash table has 64 entries, Each entry 53 points either to a null which indicates there is no entry, and implies a new translation 4s needed, step 174, or contains a pointer to a code block hcader 36. 60 Each code block header 3 describes the available 6 second processor translations for a given attribute. The 6 first field 55 in the code block header 3 contains a pointer to the next code block header 56 that hashed to that same entry 53 in the code block hash table 37. The 65 next field 57 contains the attributes of the code block. The attributes include the code segment 33 in block 58 identified as CSZ, and the alignment of the stack pointer 13 5.167,023 35 in block S2 59. For simplicity, the valid bit VI 46 is repeated as V2 60.The code block headers 36 are searched for the code block header 36 that has the Same attributes as the instruction executing at run time, step 149. The next two fields 62, 63, contain the minimum and maximum Rrst processor addresses for which there are translations with these attributes. The next field 64 is a pointer to the root of a tree that describes all the code blocks 83 for which there are translations with the specific attribute. Each of the nodes of the tree,

Disclaimer: Justia Dockets & Filings provides public litigation records from the federal appellate and district courts. These filings and docket sheets should not be considered findings of fact or liability, nor do they necessarily reflect the view of Justia.


Why Is My Information Online?