/******************************************************************************
 * Running programs on GEM5 Simulator @ LaCASA Laboratory (lacasa.uah.edu)
 * This tutorial should help you run your own programs on the gem5 simulator.
 * It is tested on the eb245-mhealth3 machine that runs 64-bit CentOS 6.3 OS.
 * Author: Aleksandar Milenkovic
 * Date: July 2013
 ******************************************************************************/

gem5 can simulate a complete system with devices and an operating system in
full system mode (FS mode), or user-space-only programs where system services
are provided directly by the simulator in syscall emulation mode (SE mode).
In this tutorial, we focus on SE mode and on running programs for the ARM and x86 ISAs.

1. Single-threaded programs / SE mode / ARM ISA
---------------------------------------------------
We will consider a simple matrix multiplication program, matmul.c, located in the
/opt/gem5/uah.tests directory. The program multiplies two square matrices and
writes the resulting matrix into a file. The input parameter is the matrix dimension.

a) Compiling program for ARM ISA.
To compile the program we will use an ARM cross compiler. We use the Code Sourcery
cross compiler with binaries available in the following directory.
Be sure this directory is included in your path:
/opt/CodeSourcery/Sourcery_CodeBench_Lite_for_ARM_GNU_Linux/bin.

Go to the directory with the matmul program (you will use your own directory for this step)
and do the following:
<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<
[milenka@EB245-mhealth3 uah.tests]$ pwd
/opt/gem5/uah.tests
[milenka@EB245-mhealth3 uah.tests]$ arm-none-linux-gnueabi-gcc -static -O3 -o matmul.arm matmul.c
[milenka@EB245-mhealth3 uah.tests]$
>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>

Note: It is crucial that the -static flag is included so that libraries are statically linked.
gem5 in SE mode does not support dynamically linked libraries.
The matmul.arm file should be created.

b) Running program.
To run the matmul program from the /opt/gem5 directory, type in the following
(the input parameter for the matmul program is specified using the --options switch):
<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<
[milenka@EB245-mhealth3 gem5]$ ./build/ARM/gem5.opt configs/example/se.py -c uah.tests/matmul.arm --options="16"
gem5 Simulator System. http://gem5.org
gem5 is copyrighted software; use the --copyright option for details.
gem5 compiled Feb 13 2013 11:10:22
gem5 started Jul 9 2013 17:45:39
gem5 executing on EB245-mhealth3
command line: ./build/ARM/gem5.opt configs/example/se.py -c uah.tests/matmul.arm --options=16
Global frequency set at 1000000000000 ticks per second
0: system.remote_gdb.listener: listening for remote gdb #0 on port 7000
**** REAL SIMULATION ****
info: Entering event queue @ 0. Starting simulation...
0.000000 seconds.
hack: be nice to actually delete the event here
Exiting @ tick 749491500 because target called exit()
>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>

The program completes at tick 749491500. The output file (matrices.txt) is created in the current directory.
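The run above reports a global frequency of 1000000000000 ticks per second, so the exit tick
can be turned into simulated time with a single division. The following is a minimal sketch of
that arithmetic; the function name ticks_to_seconds is a hypothetical helper for illustration
and is not part of gem5.

```python
# Hypothetical helper (not part of gem5): convert a tick count to simulated seconds.
# The default frequency is the value printed by the run above
# ("Global frequency set at 1000000000000 ticks per second").
def ticks_to_seconds(ticks, ticks_per_second=10**12):
    return ticks / ticks_per_second


if __name__ == "__main__":
    exit_tick = 749491500  # from "Exiting @ tick 749491500"
    print(f"{ticks_to_seconds(exit_tick):.6f} simulated seconds")  # prints 0.000749 simulated seconds
```

The result should agree with the sim_seconds entry that gem5 writes to m5out/stats.txt.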
You can examine the stats.txt file as follows:
<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<
[milenka@EB245-mhealth3 gem5]$ head m5out/stats.txt
---------- Begin Simulation Statistics ----------
sim_seconds        0.000749       # Number of seconds simulated
sim_ticks          749491500      # Number of ticks simulated
final_tick         749491500      # Number of ticks from beginning of simulation (restored from checkpoints and never reset)
sim_freq           1000000000000  # Frequency of simulated ticks
host_inst_rate     2718688        # Simulator instruction rate (inst/s)
host_op_rate       3488858        # Simulator op (including micro ops) rate (op/s)
host_tick_rate     1744399860     # Simulator tick rate (ticks/s)
host_mem_usage     752156         # Number of bytes of host memory used
...
>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>

2. Single-threaded programs / SE mode / x86
---------------------------------------------------
a) Compiling program for X86 ISA.
To compile the program we will use gcc on the host machine.
Be sure that the glibc-static library has been installed on your machine before
proceeding further (yum install glibc-static).

Go to the directory with the matmul program (you will use your own directory for this step)
and do the following:
<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<
[milenka@EB245-mhealth3 uah.tests]$ pwd
/opt/gem5/uah.tests
[milenka@EB245-mhealth3 uah.tests]$ gcc -static -o matmul.x86 matmul.c
[milenka@EB245-mhealth3 uah.tests]$
>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>

Note: It is crucial that the -static flag is included so that libraries are statically linked.
gem5 in SE mode does not support dynamically linked libraries.
The matmul.x86 file should be created.

b) Running program.
To run the matmul program from the /opt/gem5 directory, type in the following
(the input parameter for the matmul program is specified using the --options switch):
<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<
[milenka@EB245-mhealth3 gem5]$ ./build/X86/gem5.opt configs/example/se.py -c uah.tests/matmul.x86 --options="16"
gem5 Simulator System. http://gem5.org
gem5 is copyrighted software; use the --copyright option for details.
gem5 compiled Jul 6 2013 16:46:21
gem5 started Jul 9 2013 18:01:01
gem5 executing on EB245-mhealth3
command line: ./build/X86/gem5.opt configs/example/se.py -c uah.tests/matmul.x86 --options=16
Global frequency set at 1000000000000 ticks per second
0: system.remote_gdb.listener: listening for remote gdb #0 on port 7000
**** REAL SIMULATION ****
info: Entering event queue @ 0. Starting simulation...
warn: instruction 'fldcw_Mw' unimplemented
0.000000 seconds.
hack: be nice to actually delete the event here
Exiting @ tick 799847000 because target called exit()
[milenka@EB245-mhealth3 gem5]$
>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>

The program completes at tick 799847000. The output file (matrices.txt) is created in the current directory.
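The stats.txt excerpt from the ARM run in section 1 suggests that each line holds a statistic
name, a value, and a description after a '#' character. The following is a minimal sketch of
reading such lines into a dictionary, assuming that layout holds for the lines of interest;
parse_stats is a hypothetical helper and not part of gem5, and the sample text is copied from
the section 1 excerpt.

```python
# Hypothetical helper (not part of gem5): parse "name value # description" lines.
def parse_stats(text):
    stats = {}
    for line in text.splitlines():
        # Drop the trailing "# description" part, then split name and value.
        fields = line.split("#", 1)[0].split()
        if len(fields) < 2:
            continue  # blank or incomplete line
        try:
            stats[fields[0]] = float(fields[1])
        except ValueError:
            continue  # banner lines such as "---------- Begin ..." are skipped
    return stats


# Sample lines copied from the ARM run's stats.txt excerpt in section 1.
SAMPLE = """\
---------- Begin Simulation Statistics ----------
sim_seconds 0.000749 # Number of seconds simulated
sim_ticks 749491500 # Number of ticks simulated
sim_freq 1000000000000 # Frequency of simulated ticks
host_inst_rate 2718688 # Simulator instruction rate (inst/s)
"""

if __name__ == "__main__":
    stats = parse_stats(SAMPLE)
    # Cross-check: simulated time is sim_ticks divided by sim_freq.
    print(stats["sim_ticks"] / stats["sim_freq"])
```

To use it on a real run, the contents of m5out/stats.txt can be read with open() and passed to
parse_stats in place of SAMPLE.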
You can examine the stats.txt file as follows:
<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<
[milenka@EB245-mhealth3 gem5]$ head m5out/stats.txt
---------- Begin Simulation Statistics ----------
sim_seconds        0.000800       # Number of seconds simulated
sim_ticks          799847000      # Number of ticks simulated
final_tick         799847000      # Number of ticks from beginning of simulation (restored from checkpoints and never reset)
sim_freq           1000000000000  # Frequency of simulated ticks
host_inst_rate     1485825        # Simulator instruction rate (inst/s)
host_op_rate       2633661        # Simulator op (including micro ops) rate (op/s)
host_tick_rate     1575547741     # Simulator tick rate (ticks/s)
host_mem_usage     761740         # Number of bytes of host memory used
>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>

3. Pthread programs / SE mode / ARM ISA
----------------------------------------
a) Compilation of programs
To run parallel programs in SE mode, a special lightweight pthread library called m5threads
is required. You can find this library at http://repo.gem5.org/ (m5threads).
Our copy is available at /opt/m5threads-1118adb7cdad_MOD.
Note: We needed to modify the pthread.c file and the makefile to remove the omp test examples
that will not compile with the ARM cross compiler.
Compilation is carried out using the following commands:
<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<
[milenka@EB245-mhealth3 m5threads-1118adb7cdad_MOD]$ pwd
/opt/m5threads-1118adb7cdad_MOD
[milenka@EB245-mhealth3 m5threads-1118adb7cdad_MOD]$ cd tests/
[milenka@EB245-mhealth3 tests]$ make -f Makefile.arm
arm-none-linux-gnueabi-g++ -g -O3 -march=armv7-a -marm -c -o test_stackgrow.o test_stackgrow.cpp
arm-none-linux-gnueabi-g++ -g -O3 -march=armv7-a -marm -c -o test_pthreadbasic.o test_pthreadbasic.cpp
arm-none-linux-gnueabi-g++ -g -O3 -march=armv7-a -marm -c -o test_pthread.o test_pthread.cpp
arm-none-linux-gnueabi-g++ -g -O3 -march=armv7-a -marm -c -o test_atomic.o test_atomic.cpp
arm-none-linux-gnueabi-g++ -g -O3 -march=armv7-a -marm -c -o test_barrier.o test_barrier.cpp
arm-none-linux-gnueabi-g++ -g -O3 -march=armv7-a -marm -c -o test_lock.o test_lock.cpp
arm-none-linux-gnueabi-g++ -g -O3 -march=armv7-a -marm -c -o test_malloc.o test_malloc.cpp
arm-none-linux-gnueabi-g++ -g -O3 -march=armv7-a -marm -c -o test_sieve.o test_sieve.cpp
arm-none-linux-gnueabi-g++ -g -O3 -march=armv7-a -marm -c -o test___thread.o test___thread.cpp
arm-none-linux-gnueabi-gcc -g -O3 -march=armv7-a -marm -c ../pthread.c -o ../pthread.o
arm-none-linux-gnueabi-g++ -static -o test_stackgrow test_stackgrow.o ../pthread.o
arm-none-linux-gnueabi-g++ -static -o test_stackgrow_p test_stackgrow.o -lpthread
arm-none-linux-gnueabi-g++ -static -o test_pthreadbasic test_pthreadbasic.o ../pthread.o
arm-none-linux-gnueabi-g++ -static -o test_pthreadbasic_p test_pthreadbasic.o -lpthread
arm-none-linux-gnueabi-g++ -static -o test_pthread test_pthread.o ../pthread.o
arm-none-linux-gnueabi-g++ -static -o test_pthread_p test_pthread.o -lpthread
arm-none-linux-gnueabi-g++ -static -o test_atomic test_atomic.o ../pthread.o
arm-none-linux-gnueabi-g++ -static -o test_atomic_p test_atomic.o -lpthread
arm-none-linux-gnueabi-g++ -static -o test_barrier test_barrier.o ../pthread.o
arm-none-linux-gnueabi-g++ -static -o test_barrier_p test_barrier.o -lpthread
arm-none-linux-gnueabi-g++ -static -o test_lock test_lock.o ../pthread.o
arm-none-linux-gnueabi-g++ -static -o test_lock_p test_lock.o -lpthread
arm-none-linux-gnueabi-g++ -static -o test_malloc test_malloc.o ../pthread.o
arm-none-linux-gnueabi-g++ -static -o test_malloc_p test_malloc.o -lpthread
arm-none-linux-gnueabi-g++ -static -o test_sieve test_sieve.o ../pthread.o
arm-none-linux-gnueabi-g++ -static -o test_sieve_p test_sieve.o -lpthread
arm-none-linux-gnueabi-g++ -static -o test___thread test___thread.o ../pthread.o
arm-none-linux-gnueabi-g++ -static -o test___thread_p test___thread.o -lpthread
[milenka@EB245-mhealth3 tests]$
>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>

b) Running programs
We will run one of the test programs provided with m5threads.
The test program is test___thread and is located in /opt/m5threads-1118adb7cdad_MOD/tests/.
We create 4 threads (--options="4") and also specify a machine with 4 processor cores (-n 4).
<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<
[milenka@EB245-mhealth3 gem5]$ ./build/ARM/gem5.opt configs/example/se.py -n 4 -c /opt/m5threads-1118adb7cdad_MOD/tests/test___thread --options="4"
gem5 Simulator System. http://gem5.org
gem5 is copyrighted software; use the --copyright option for details.
gem5 compiled Feb 13 2013 11:10:22
gem5 started Jul 9 2013 18:15:07
gem5 executing on EB245-mhealth3
command line: ./build/ARM/gem5.opt configs/example/se.py -n 4 -c /opt/m5threads-1118adb7cdad_MOD/tests/test___thread --options=4
Global frequency set at 1000000000000 ticks per second
0: system.remote_gdb.listener: listening for remote gdb #0 on port 7000
0: system.remote_gdb.listener: listening for remote gdb #1 on port 7001
0: system.remote_gdb.listener: listening for remote gdb #2 on port 7002
0: system.remote_gdb.listener: listening for remote gdb #3 on port 7003
**** REAL SIMULATION ****
info: Entering event queue @ 0. Starting simulation...
Starting 4 threads...
warn: User mode does not have SPSR
warn: User mode does not have SPSR
warn: User mode does not have SPSR
warn: User mode does not have SPSR
warn: User mode does not have SPSR
warn: User mode does not have SPSR
warn: ignoring syscall futex(0, 589408, ...)
&local[1]=0x400224c8
&local[0]=0x400004c8
&local[3]=0x400624c8
&local[2]=0x400424c8
local[0] = 1031
local[1] = 1032
local[2] = 1033
local[3] = 1034
hack: be nice to actually delete the event here
Exiting @ tick 87428500 because target called exit()
[milenka@EB245-mhealth3 gem5]$
>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>

To see the traces you can run the following command:
<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<
[milenka@EB245-mhealth3 gem5]$ ./build/ARM/gem5.opt --debug-flags=Exec,ExecTicks configs/example/se.py -n 4 -c /opt/m5threads-1118adb7cdad_MOD/tests/test___thread --options="4"
>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>

You will see instructions executing on different CPU cores (see the trace snippet below).
<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<
....
3660000: system.cpu3 T0 : 0x21f7c : bcc : IntAlu : Predicated False
3660000: system.cpu2 T0 : 0x343ec.2 : str_uop r5, [r34, #32] : MemWrite : D=0x0000000000000002 A=0x40061788
3660000: system.cpu1 T0 : 0x21ec0.5 : ldr_uop r35, [r34, #16] : MemRead : D=0x000000000006c914 A=0x4004124c
3660500: system.cpu1 T0 : 0x21ec0.6 : addi_uop sp, sp, #20 : IntAlu : D=0x0000000040041250
3660500: system.cpu2 T0 : 0x343ec.3 : str_uop r6, [r34, #28] : MemWrite : D=0x0000000000000008 A=0x4006178c
3660500: system.cpu3 T0 : 0x21f80 : rsb r3, r1, #0 : IntAlu : D=0x00000000fff73160
3660500: system.cpu0 T0 : 0x8e3c : sub sp, sp, #12 : IntAlu : D=0x00000000befffd40
3661000: system.cpu0 T0 : 0x8e40 : ldr r6, [pc, #188] : MemRead : D=0x0000000000000008 A=0x8f04
3661000: system.cpu3 T0 : 0x21f84 : ands r3, r3, #3 : IntAlu : D=0x0000000000000001
3661000: system.cpu2 T0 : 0x343ec.4 : str_uop r7, [r34, #24] : MemWrite : D=0x00000000400424c8 A=0x40061790
3661000: system.cpu1 T0 : 0x21ec0.7 : uopReg_uop r8, r35 : IntAlu : D=0x000000000006c914
3661500: system.cpu1 T0 : 0x21ec4 : bx : IntAlu :
3661500: system.cpu2 T0 : 0x343ec.5 : str_uop r8, [r34, #20] : MemWrite : D=0x0000000000000002 A=0x40061794
3661500: system.cpu3 T0 : 0x21f88 : beq : IntAlu :
3661500: system.cpu0 T0 : 0x8e44 : mov r5, r0 : IntAlu : D=0x0000000000000000
3662000: system.cpu0 T0 : 0x8e48 : mrc r4, r57 : IntAlu : D=0x00000000400004c0
3662000: system.cpu3 T0 : 0x21fac : eor r12, r0, r1 : IntAlu : D=0x00000000400eea68
3662000: system.cpu2 T0 : 0x343ec.6 : str_uop r9, [r34, #16] : MemWrite : D=0x0000000000000000 A=0x40061798
3662000: system.cpu1 T0 : 0x34464 : tsts r4, #32768 : IntAlu : D=0x0000000000000001
3662500: system.cpu1 T0 : 0x34468 : strne r6, [fp, #-1184] : MemWrite : Predicated False
3662500: system.cpu2 T0 : 0x343ec.7 : str_uop r10, [r34, #12] : MemWrite : D=0x0000000000008e38 A=0x4006179c
3662500: system.cpu3 T0 : 0x21fb0 : tsts r12, #3 : IntAlu : D=0x0000000000000001
....
>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>

4. Pthread / SE / X86
----------------------
<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<
[milenka@EB245-mhealth3 gem5]$ ./build/X86/gem5.opt configs/example/se.py -n 4 -c /opt/m5threads-1118adb7cdad_X86/tests/test___thread --options="4"
gem5 Simulator System. http://gem5.org
gem5 is copyrighted software; use the --copyright option for details.
gem5 compiled Jul 6 2013 16:46:21
gem5 started Jul 9 2013 22:08:38
gem5 executing on EB245-mhealth3
command line: ./build/X86/gem5.opt configs/example/se.py -n 4 -c /opt/m5threads-1118adb7cdad_X86/tests/test___thread --options=4
Global frequency set at 1000000000000 ticks per second
0: system.remote_gdb.listener: listening for remote gdb #0 on port 7000
0: system.remote_gdb.listener: listening for remote gdb #1 on port 7001
0: system.remote_gdb.listener: listening for remote gdb #2 on port 7002
0: system.remote_gdb.listener: listening for remote gdb #3 on port 7003
**** REAL SIMULATION ****
info: Entering event queue @ 0. Starting simulation...
warn: instruction 'fldcw_Mw' unimplemented
Starting 4 threads...
&local[1]=0x2aaaaaacd038
&local[0]=0x2aaaaaaab038
&local[2]=0x2aaaaaaed038
&local[3]=0x2aaaaab0d038
local[0] = 1031
local[1] = 1032
local[2] = 1033
local[3] = 1034
hack: be nice to actually delete the event here
Exiting @ tick 137155500 because target called exit()
>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>

5. Pthread matrix multiplication
--------------------------------
a) Compiling a pthread program.
We consider a pthread matrix multiplication program written by Wesley Kos.
The source code is in the files matmulti.c and matmulti.h
(note: this code should be cleaned up).
<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<
#Compilation:
[milenka@EB245-mhealth3 uah.tests]$ arm-none-linux-gnueabi-gcc -static -c matmult-dyn.c
#Linking:
[milenka@EB245-mhealth3 uah.tests]$ arm-none-linux-gnueabi-gcc -o matmult-dyn.arm matmult-dyn.o pthread.o --static

[milenka@EB245-mhealth3 gem5]$ ./build/ARM/gem5.opt configs/example/se.py -n 5 -c uah.tests/matmult-dyn.arm --options="16 4"
gem5 Simulator System. http://gem5.org
gem5 is copyrighted software; use the --copyright option for details.
gem5 compiled Feb 13 2013 11:10:22
gem5 started Jul 10 2013 15:13:19
gem5 executing on EB245-mhealth3
command line: ./build/ARM/gem5.opt configs/example/se.py -n 5 -c uah.tests/matmult-dyn.arm --options=16 4
Global frequency set at 1000000000000 ticks per second
0: system.remote_gdb.listener: listening for remote gdb #0 on port 7000
0: system.remote_gdb.listener: listening for remote gdb #1 on port 7001
0: system.remote_gdb.listener: listening for remote gdb #2 on port 7002
0: system.remote_gdb.listener: listening for remote gdb #3 on port 7003
0: system.remote_gdb.listener: listening for remote gdb #4 on port 7004
**** REAL SIMULATION ****
info: Entering event queue @ 0. Starting simulation...
warn: User mode does not have SPSR
warn: User mode does not have SPSR
warn: User mode does not have SPSR
warn: User mode does not have SPSR
warn: User mode does not have SPSR
warn: User mode does not have SPSR
warn: User mode does not have SPSR
warn: User mode does not have SPSR
hack: be nice to actually delete the event here
Exiting @ tick 97064000 because target called exit()
>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>
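
All runs in this tutorial follow the same command pattern: the gem5 binary for the target ISA,
an optional --debug-flags list, the se.py script, then -n, -c and --options. The following is a
minimal sketch of assembling that command as an argument list, for example to script several
runs. It assumes the command is launched from the gem5 directory as in the transcripts;
build_se_command is a hypothetical helper, and only flags that appear in this tutorial are used.

```python
import shlex


# Hypothetical helper (not part of gem5): build the argument list for an SE-mode
# run, using only the flags that appear in this tutorial.
def build_se_command(binary, options, num_cpus=1, isa="ARM", debug_flags=None):
    cmd = [f"./build/{isa}/gem5.opt"]
    if debug_flags:
        # Debug flags are passed to gem5 itself, so they go before the script.
        cmd.append("--debug-flags=" + ",".join(debug_flags))
    cmd.append("configs/example/se.py")
    if num_cpus > 1:
        cmd += ["-n", str(num_cpus)]
    # Keeping the options in one list element avoids shell quoting issues
    # when the program takes several space-separated parameters.
    cmd += ["-c", binary, f"--options={options}"]
    return cmd


if __name__ == "__main__":
    # The pthread matrix multiplication run from section 5.
    cmd = build_se_command("uah.tests/matmult-dyn.arm", "16 4", num_cpus=5)
    print(shlex.join(cmd))
```

The returned list can be handed to subprocess.run(cmd) on a machine where gem5 is built;
the sketch itself only prints the command.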