Linux TLS pointer access reference emulation

Maciej W. Rozycki macro at mips.com
Wed Feb 9 19:40:59 UTC 2005


Hello,

 I have published a Linux patch and a small test user program that I've 
used for performance evaluation of a few possible TLS pointer access 
methods.  The software is available at: 
"ftp://ftp.linux-mips.org/pub/linux/mips/people/macro/tls.tar.bz2". The 
patch implements an emulation of "rdhwr $2, $4" and syscall #0x10000, both 
retrieving a member of "struct thread_info" associated with the current 
process (the patch uses an arbitrary one; of course for the TLS pointer 
that should be replaced with a meaningful struct field).
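
 For illustration only, user code issuing the "rdhwr $2, $4" instruction 
named above might look roughly like the sketch below.  It assumes GCC 
inline assembly; the helper name read_hwr4() is made up and is not taken 
from the patch or the test program.  On a kernel without the emulation the 
instruction would be expected to trap, and the non-MIPS branch is only a 
placeholder so that the sketch compiles elsewhere.

```c
/*
 * Illustrative only: read hardware register $4 into $2 (v0), matching the
 * "rdhwr $2, $4" encoding named in the text.  read_hwr4() is a made-up
 * helper name, not part of the patch.
 */
static inline unsigned long read_hwr4(void)
{
#if defined(__mips__)
    /* Pin the result to $2 so the emitted instruction is "rdhwr $2, $4". */
    register unsigned long value __asm__("$2");

    __asm__ __volatile__(
        ".set push\n\t"
        ".set mips32r2\n\t"     /* rdhwr is a MIPS32r2 instruction */
        "rdhwr %0, $4\n\t"
        ".set pop"
        : "=r" (value));
    return value;
#else
    /* Placeholder so the sketch still compiles on non-MIPS hosts. */
    return 0UL;
#endif
}
```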

 The patch applies to Linux 2.6.9-rc1, specifically to the malta CVS 
repository at linux-mips.org as of Oct 20th, 2004.  It should work with 
the corresponding version from the main repository as well, but using it 
with the current revision requires adapting it to the synthesized TLB 
handlers.

 The patch has its shortcomings; most notably, it has been written for the 
32-bit kernel only.  For 64-bit kernels it needs to be aware of the XTLB 
refill handler.  This may actually be done quite nicely with the 
synthesized handlers, which would also avoid the need to fetch the shadow 
of the EBase cp0 register.

 The userland software consists of a small program that benchmarks the 
available methods in tight loops.  With caches kept warm, this should 
provide a reasonable, if optimistic, execution time estimate.  The program 
expects two arguments: a CPU frequency (which you can obtain from `dmesg' 
or, failing that, from your system's specs) and a number of loops to 
execute.  Its output is essentially raw, as it has not really been meant 
for general use; the figures of most interest are the "cycle count" 
reports for "scall" and "rdhwr" (these should be self-explanatory), and 
perhaps "instr" as well (which counts cycles used for a single instruction 
and may not be accurate, depending on the implementation of your 
processor's execution pipeline(s)).
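
 As a rough sketch of why the two arguments are needed: a per-iteration 
cycle count can be derived from the elapsed time, the CPU frequency and 
the loop count.  The code below is not the actual benchmark source; the 
names cycles_per_loop() and time_loop() are hypothetical, and the use of 
clock() for timing is an assumption made for portability.

```c
#include <stdint.h>
#include <time.h>

/*
 * Hypothetical helper: convert an elapsed time in seconds into the
 * average number of CPU cycles spent per loop iteration.
 */
static double cycles_per_loop(double elapsed_sec, double cpu_hz,
                              uint64_t loops)
{
    if (loops == 0)
        return 0.0;             /* nothing was measured */
    return elapsed_sec * cpu_hz / (double)loops;
}

/*
 * Hypothetical harness: call fn() `loops` times in a tight loop and
 * return the CPU time consumed, in seconds, as reported by clock().
 */
static double time_loop(void (*fn)(void), uint64_t loops)
{
    clock_t start = clock();

    for (uint64_t i = 0; i < loops; i++)
        fn();

    return (double)(clock() - start) / CLOCKS_PER_SEC;
}
```

 For example, 2 seconds spent on 1000000 iterations at 100 MHz works out 
to 200 cycles per iteration.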

 There are actually two programs included -- "time-0" and "time-1"; the 
former is what I've used for benchmarking and the latter is mainly for 
verification of proper operation with VIVT I-caches (but its output is 
meaningful, too).

 With Linux from the malta repository the programs can be trivially 
modified to benchmark a full instruction emulation.  This version of Linux 
already emulates "rdhwr" for other registers, so all that has to be done 
is to replace "rdhwr $2, $4" with e.g. "rdhwr $2, $2" in rdhwr.h; that's 
how I did these additional benchmarks for Daniel.

 Alongside the software there are a few reports provided that I've 
obtained with a Malta board for CPUs I've had immediately available on 
core cards.

 Please let me know if you have any further questions regarding this 
package.

  Maciej


