TraceTimer Class Reference

#include <CH_Timer.H>

List of all members.

Detailed Description

TraceTimer class is a self-tracing code instrumentation system

TraceTimer class is a self-tracing code instrumentation system for Chombo (or any other package really). The user interface is specified by a small set of macros. The usage model is that you just leave these timers in the code, for good. Initially, your application will have 'main' and a few hewavy functions instrumented, and the lower level Chombo library instrumentation. As your tool or application matures, it will garner a larger set of instrumentation giving clear views of your code performance. After a routine has been cleverly and lovingly optimized, you leave in the timers, to spot when some later bug fix or *improvement* undoes your previous labors.

You should never need to use or interact with the the classes TraceTimer or AutoStart. Use the macros. They call the right functions and classes for you.
The first macro is what people will use the most:

This is the simplest interface for timers. you place this macro call in a function you wish to be timed. It handles making the timer, calling 'start' when you enter the function, and calling 'stop' when you leave the function. A good idea is to use a 'label' specific enough to be unambiguous without being overwhelming. for instance:

    void AMRLevelPolytropicGas::define(AMRLevel*            a_coarserLevelPtr,
                                       const ProblemDomain& a_problemDomain,
                                       int                  a_level,
                                       int                  a_refRatio)

In this case, we have a class with many constructors and define functions that all funnel into a single general function. We can just call this 'define' and not worry about naming/instrumenting all the different overloaded instances. If you slip up and use the same label twice, that is not a real problem, the two locations will be timed and tracked properly (even if one is a sibling or parent of the other). The only place it will make things a little harder is in the output where you might have the same name show up and look confusing.

In serial, you will see a file called time.table (in parallel, you will get a time.table.n (where n is the rank number) files). If you want fewer files, you can do setenv CH_OUTPUT_INTERVAL nproc and it will only output every nproc processors time.table.n files (where nnproc == 0). I won't go into this file format. It is kind of gprof-ish, with what I consider improvements. The real benefit here is profiling that understands our Chombo context, a smaller information set to observe, and the fact that, so far in my testing, the timers have negligible impact on the run time or memory use of the code.

By default, Chombo compiles in the instructions for the timers wherever the macros appear. If the compiler macro CH_NTIMER is defined, then all the CH_TIME* macros evaluate to empty expressions at compile time.

So, you put some CH_TIME calls in your code and ran it, and nothing happened:
Chombo looks for the environment variable CH_TIMER. If it is set to anything (even if it is set to 'false' or 'no' or whatever) then the timers will be active and reporting will happen. If this environment variable is not set, then all the timers check a bool and return after doing nothing.
One point of interest with using the environment variable: In parallel jobs using mpich, only processor 0 inherits the environment variables from the shell where you invoke 'mpirun', the rest read your .cshrc (.bashrc, etc.) file to get their environment. To time all your processes, you need to make sure the CH_TIMER environment variable gets to all your processes.
Auto hierarchy:
The timers automatically figure out their parent/child relationships. They also can be placed in template code. This has some consequences. First, if you have a low level function instrumented that has no timers near it in the code call stack, you will see it show up as a child of a high level timer. the root timer "main" will catch all orphaned timers. So, even though you might make no call to, say, 'exchange' in your 'main' function, you might very well call a function, that calls a function, that calls 'exchange'. Since no code in between was instrumented, this exchange is accounted for at 'main'. This might look strange, but it should prove very powerful. An expensive orphan is exactly where you should consider some more timers, or reconsidering code design.
For performance reasons, child timers have only one parent. As a consequence each CH_TIME("label") label can show up at multiple places in your output. Each instance has it's own timer. So, each path through the call graph that arrives at a low-level function has a unique lineage, with it's own counter and time. Thus, I can instrument LevelData::copyTo once, but copyTo can appear in many places in the time.table file.
The next level up in complexity is the set of *four* macros for when you want sub-function resolution in your timers. For instance, in a really huge function that you have not figured out how to re-factor, or built with lots of bad cut n paste code 're-use'.
     CH_TIMER("child1", t1);
     CH_TIMER("child2", t2);

CH_TIMERS has the same semantic as CH_TIME, except that you can declare an arbitrary number of children after it in the same function scope. The children here do not autostart and autostop, you have to tell them where to start and stop timing. The children can themselves be parents for timers in called functions, of course. The children obey a set of mutual exclusions. The following generate run time errors:

the following will generate compile time errors:

You do not have to put any calls in your main routine to activate the clocks or generate a report at completion, this is handled with static iniitalization and an atexit function.

There is a larger argument of manual instrumentation being counter to good development. Profiling the code is supposed to tell you where to expend your optimization effort. Manual instrumentation opens the door to people wasting time *assuming* what parts of the code are going to take up lots of time and instrumenting them, before seeing any real performance data. Good judgement is needed. We have a body of knowledge about Chombo that will inform us about a good minimal first set of functions to instrument.

Public Member Functions

virtual ~TraceTimer ()
void start (char *mutex)
unsigned long long int stop (char *mutex)
void leafStart ()
void leafStop ()
unsigned long long int time () const
int rank () const
long long int count () const
void prune ()
bool isPruned () const
const std::vector< TraceTimer * > & children () const

Static Public Member Functions

static void report (bool a_closeAfter=false)
static void reset ()
static int initializer ()
static TraceTimergetTimer (const char *name)
static void PruneTimersParentChildPercent (double percent)
static void sampleMemUsage ()
static const char * currentTimer ()

Private Member Functions

 TraceTimer (const char *a_name, TraceTimer *parent, int thread_id)
void reportTree (FILE *out, const TraceTimer &node, int depth)
const TraceTimeractiveChild () const
void macroTest ()
void macroTest2 ()
void currentize () const
int computeRank () const

Static Private Member Functions

static void reportFullTree (FILE *out, const TraceTimer &timer, unsigned long long int totalTime, int depth)
static void reportOneTree (FILE *out, const TraceTimer &timer)
static void reportMemoryOneTree (FILE *out, const TraceTimer &timer)
static void subReport (FILE *out, const char *header, unsigned long long int totalTime)
static void reset (TraceTimer &timer)
static void PruneTimersParentChildPercent (double threshold, TraceTimer *parent)

Private Attributes

bool m_pruned
std::vector< TraceTimer * > m_children
const char * m_name
long long int m_count
unsigned long long int m_accumulated_WCtime
unsigned long long int m_last_WCtime_stamp
int m_rank
int m_thread_id
long long int m_memory
long long int m_last_Memory_Stamp
long long int m_peak

Static Private Attributes

static std::vector< TraceTimer * > s_roots
static std::vector< TraceTimer * > s_currentTimer
static bool s_traceMemory
static long long int s_peak
static TraceTimers_peakTimer

Constructor & Destructor Documentation

virtual TraceTimer::~TraceTimer (  )  [virtual]

TraceTimer::TraceTimer ( const char *  a_name,
TraceTimer parent,
int  thread_id 
) [private]

Member Function Documentation

void TraceTimer::start ( char *  mutex  ) 

Referenced by AutoStart::AutoStart().

unsigned long long int TraceTimer::stop ( char *  mutex  ) 

Referenced by AutoStart::~AutoStart().

static void TraceTimer::report ( bool  a_closeAfter = false  )  [static]

static void TraceTimer::reset (  )  [static]

void TraceTimer::leafStart (  )  [inline]

void TraceTimer::leafStop (  )  [inline]

unsigned long long int TraceTimer::time (  )  const [inline]

References m_accumulated_WCtime.

int TraceTimer::rank (  )  const [inline]

References m_rank.

long long int TraceTimer::count (  )  const [inline]

References m_count.

void TraceTimer::prune (  ) 

bool TraceTimer::isPruned (  )  const [inline]

References m_pruned.

static int TraceTimer::initializer (  )  [static]

static TraceTimer* TraceTimer::getTimer ( const char *  name  )  [static]

const std::vector<TraceTimer*>& TraceTimer::children (  )  const

static void TraceTimer::PruneTimersParentChildPercent ( double  percent  )  [static]

static void TraceTimer::sampleMemUsage (  )  [static]

static const char* TraceTimer::currentTimer (  )  [inline, static]

References s_currentTimer.

void TraceTimer::reportTree ( FILE *  out,
const TraceTimer node,
int  depth 
) [private]

const TraceTimer* TraceTimer::activeChild (  )  const [private]

void TraceTimer::macroTest (  )  [private]

void TraceTimer::macroTest2 (  )  [private]

void TraceTimer::currentize (  )  const [private]

int TraceTimer::computeRank (  )  const [private]

static void TraceTimer::reportFullTree ( FILE *  out,
const TraceTimer timer,
unsigned long long int  totalTime,
int  depth 
) [static, private]

static void TraceTimer::reportOneTree ( FILE *  out,
const TraceTimer timer 
) [static, private]

static void TraceTimer::reportMemoryOneTree ( FILE *  out,
const TraceTimer timer 
) [static, private]

static void TraceTimer::subReport ( FILE *  out,
const char *  header,
unsigned long long int  totalTime 
) [static, private]

static void TraceTimer::reset ( TraceTimer timer  )  [static, private]

static void TraceTimer::PruneTimersParentChildPercent ( double  threshold,
TraceTimer parent 
) [static, private]

Member Data Documentation

std::vector<TraceTimer*> TraceTimer::s_roots [static, private]

std::vector<TraceTimer*> TraceTimer::s_currentTimer [static, private]

Referenced by currentTimer().

bool TraceTimer::s_traceMemory [static, private]

long long int TraceTimer::s_peak [static, private]

TraceTimer* TraceTimer::s_peakTimer [static, private]

bool TraceTimer::m_pruned [private]

Referenced by isPruned(), leafStart(), and leafStop().

std::vector<TraceTimer*> TraceTimer::m_children [private]

const char* TraceTimer::m_name [private]

long long int TraceTimer::m_count [private]

Referenced by count(), and leafStart().

unsigned long long int TraceTimer::m_accumulated_WCtime [private]

Referenced by leafStop(), and time().

unsigned long long int TraceTimer::m_last_WCtime_stamp [private]

Referenced by leafStart(), and leafStop().

int TraceTimer::m_rank [mutable, private]

Referenced by rank().

int TraceTimer::m_thread_id [private]

long long int TraceTimer::m_memory [private]

long long int TraceTimer::m_last_Memory_Stamp [private]

long long int TraceTimer::m_peak [private]

The documentation for this class was generated from the following file:

Generated on Fri Mar 11 12:53:44 2016 for AMRStencil by  doxygen 1.5.5