Chombo + EB  3.2
Public Member Functions | Static Public Member Functions | Private Member Functions | Static Private Member Functions | Private Attributes | Static Private Attributes | List of all members
TraceTimer Class Reference

#include <CH_Timer.H>

Public Member Functions

virtual ~TraceTimer ()
void start (char *mutex)
unsigned long long int stop (char *mutex)
void leafStart ()
void leafStop ()
unsigned long long int time () const
int rank () const
long long int count () const
void prune ()
bool isPruned () const
const std::vector< TraceTimer * > & children () const

Static Public Member Functions

static void report (bool a_closeAfter=false)
static void reportName (std::ostream &out, const char *name)
static void reset ()
static int initializer ()
static TraceTimergetTimer (const char *name)
static void PruneTimersParentChildPercent (double percent)
static void sampleMemUsage (const char *a_name)
static void sampleMemUsage2 ()
static void reportPeak (FILE *out)
static const char * currentTimer ()
static bool timersOn ()

Private Member Functions

 TraceTimer (const char *a_name, TraceTimer *parent, int thread_id)
void reportTree (FILE *out, const TraceTimer &node, int depth)
const TraceTimeractiveChild () const
void macroTest ()
void macroTest2 ()
void currentize () const
int computeRank () const

Static Private Member Functions

static bool find (std::list< const TraceTimer *> &a_trace, const TraceTimer *a_target)
static void reportFullTree (FILE *out, const TraceTimer &timer, unsigned long long int totalTime, int depth, TraceTimer **stack_bottom)
static void reportOneTree (FILE *out, const TraceTimer &timer)
static void updateMemory (TraceTimer &a_timer)
static void subReport (FILE *out, const char *header, unsigned long long int totalTime)
static void reset (TraceTimer &timer)
static void PruneTimersParentChildPercent (double threshold, TraceTimer *parent)

Private Attributes

bool m_pruned
std::vector< TraceTimer * > m_children
const char * m_name
long long int m_count
unsigned long long int m_accumulated_WCtime
unsigned long long int m_last_WCtime_stamp
int m_rank
int m_thread_id
unsigned int m_memoryMin
unsigned int m_memoryMax
unsigned long long int m_flopEnter
unsigned long long int m_flops

Static Private Attributes

static std::vector< TraceTimer * > s_roots
static std::vector< TraceTimer * > s_currentTimer
static long long int s_peak
static TraceTimers_peakTimer
static bool s_memorySampling
static bool s_tracing

Detailed Description

TraceTimer class is a self-tracing code instrumentation system

TraceTimer class is a self-tracing code instrumentation system for Chombo (or any other package really). The user interface is specified by a small set of macros. The usage model is that you just leave these timers in the code, for good. Initially, your application will have 'main' and a few hewavy functions instrumented, and the lower level Chombo library instrumentation. As your tool or application matures, it will garner a larger set of instrumentation giving clear views of your code performance. After a routine has been cleverly and lovingly optimized, you leave in the timers, to spot when some later bug fix or improvement undoes your previous labors.

You should never need to use or interact with the the classes TraceTimer or AutoStart. Use the macros. They call the right functions and classes for you.

The first macro is what people will use the most:

 This is the simplest interface for timers.  you place this macro call in a function
 you wish to be timed.  It handles making the timer, calling 'start' when you
 enter the function, and calling 'stop' when you leave the function.  A good
 idea is to use a 'label' specific enough to be unambiguous without being
 overwhelming.  for instance:
void AMRLevelPolytropicGas::define(AMRLevel* a_coarserLevelPtr,
const ProblemDomain& a_problemDomain,
int a_level,
int a_refRatio)
 In this case, we have a class with many constructors and define functions that
 all funnel into a single general function.  We can just call this 'define' and
 not worry about naming/instrumenting all the different overloaded instances. If
 you slip up and use the same label twice, that is not a real problem, the two
 locations will be timed and tracked properly (even if one is a sibling or parent
 of the other). The only place it will make things a little harder is in the output
 where you might have the same name show up and look confusing.
 In serial, you will see a file called <em>time.table</em> (in parallel, you will get a
 <em>time.table.n</em> (where n is the rank number) files).  If
 you want fewer files, you can do
 setenv CH_OUTPUT_INTERVAL nproc
 and it will only output every nproc processors time.table.n files
 (where n%nproc == 0).     I won't go into this file
 format.  It is kind of gprof-ish, with what I consider improvements.  The real
 benefit here is profiling that understands our Chombo context, a smaller information
 set to observe, and the fact that, so far in my testing, the timers have negligible
 impact on the run time or memory use of the code.
 By default, Chombo compiles in the instructions for the timers wherever the macros
 appear.  If the compiler macro <b>CH_NTIMER</b> is defined, then all the CH_TIME* macros
 evaluate to empty expressions at compile time.

 \par So, you put some CH_TIME calls in your code and ran it, and nothing happened:
 Chombo looks for the environment variable <b>CH_TIMER</b>. If it is set to anything (even
 if it is set to 'false' or 'no' or whatever) then the timers will be active and
 reporting will happen.  If this environment variable is not set, then all the timers
 check a bool and return after doing nothing.
 One point of interest with using the environment variable: In parallel jobs using
 mpich, only processor 0 inherits the environment variables from the shell where
 you invoke 'mpirun', the rest read your .cshrc (.bashrc, etc.) file to get their
 environment.  To time all your processes, you need to make sure the <b>CH_TIMER</b>
 environment variable gets to all your processes.

 \par Auto hierarchy:
 The timers automatically figure out their parent/child relationships.  They
 also can be placed in template code.  This has some consequences.  First,
 if you have a low level function instrumented that has no timers near it in
 the code call stack, you will see it show up as a child of a high level timer.
 the root timer "main" will catch all orphaned timers.  So, even though you
 might make no call to, say, 'exchange' in your 'main' function, you might
 very well call a function, that calls a function, that calls 'exchange'. Since
 no code in between was instrumented, this exchange is accounted for at 'main'.
 This might look strange, but it should prove very powerful. An expensive orphan
 is exactly where you should consider some more timers, or reconsidering code

  For performance reasons, child timers have only one parent.  As a consequence
  each CH_TIME("label") label can show up at multiple places in your output. Each
  instance has it's own timer.  So, each path through the call graph that arrives
  at a low-level function has a unique lineage, with it's own counter and time.
  Thus, I can instrument LevelData::copyTo once, but copyTo can appear in many
  places in the time.table file.

 The next level up in complexity is the set of *four* macros for when you want
 sub-function resolution in your timers. For instance, in a really huge function
 that you have not figured out how to re-factor, or built with lots of bad cut n paste
 code 're-use'.
CH_TIMER("child1", t1);
CH_TIMER("child2", t2);
 CH_TIMERS has the same semantic as CH_TIME, except that you can declare an
 arbitrary number of children after it in the same function scope.  The
 children here do not autostart and autostop, you have to tell them where to
 start and stop timing.  The children can themselves be parents for timers
 in called functions, of course. The children obey a set of mutual exclusions. The
 following generate run time errors:
 - double start called
 - double stop called
 - start called when another child is also started
 - you leave the function with a child not stopped

 the following will generate compile time errors:
 - more than one CH_TIME macro in a function
 - invoking CH_TIMER("child", t) without having first invoked CH_TIMERS
 - re-using the timer handle ie. CH_TIMER("bobby", t1); CH_TIMER("sally", t1)
 - mixing CH_TIME macro with CH_TIMER
 - mixing CH_TIME macro with CH_TIMERS

 You do not have to put any calls in your main routine to activate the clocks
 or generate a report at completion, this is handled with static iniitalization
 and an atexit function.
 There is a larger argument of manual instrumentation being counter to good development.
 Profiling the code is supposed to tell you where to expend your optimization effort.
 Manual instrumentation opens the door to people wasting time *assuming* what parts of the
 code are going to take up lots of time and instrumenting them, before seeing any real
 performance data.  Good judgement is needed.  We have a body of knowledge about Chombo
 that will inform us about a good minimal first set of functions to instrument.

Constructor & Destructor Documentation

◆ ~TraceTimer()

virtual TraceTimer::~TraceTimer ( )

◆ TraceTimer()

TraceTimer::TraceTimer ( const char *  a_name,
TraceTimer parent,
int  thread_id 

Member Function Documentation

◆ start()

void TraceTimer::start ( char *  mutex)

Referenced by AutoStart::AutoStart().

◆ stop()

unsigned long long int TraceTimer::stop ( char *  mutex)

◆ report()

static void TraceTimer::report ( bool  a_closeAfter = false)

◆ reportName()

static void TraceTimer::reportName ( std::ostream &  out,
const char *  name 

◆ reset() [1/2]

static void TraceTimer::reset ( )

◆ leafStart()

void TraceTimer::leafStart ( )

References ch_flops, and ch_ticks().

Referenced by AutoStartLeaf::AutoStartLeaf().

◆ leafStop()

void TraceTimer::leafStop ( )

References ch_flops, and ch_ticks().

◆ time()

unsigned long long int TraceTimer::time ( ) const

◆ rank()

int TraceTimer::rank ( ) const

◆ count()

long long int TraceTimer::count ( ) const

◆ prune()

void TraceTimer::prune ( )

◆ isPruned()

bool TraceTimer::isPruned ( ) const

◆ initializer()

static int TraceTimer::initializer ( )

◆ getTimer()

static TraceTimer* TraceTimer::getTimer ( const char *  name)

◆ children()

const std::vector<TraceTimer*>& TraceTimer::children ( ) const

◆ PruneTimersParentChildPercent() [1/2]

static void TraceTimer::PruneTimersParentChildPercent ( double  percent)

◆ sampleMemUsage()

static void TraceTimer::sampleMemUsage ( const char *  a_name)

◆ sampleMemUsage2()

static void TraceTimer::sampleMemUsage2 ( )

◆ reportPeak()

static void TraceTimer::reportPeak ( FILE *  out)

◆ currentTimer()

static const char* TraceTimer::currentTimer ( )

◆ timersOn()

static bool TraceTimer::timersOn ( )

◆ find()

static bool TraceTimer::find ( std::list< const TraceTimer *> &  a_trace,
const TraceTimer a_target 

◆ reportTree()

void TraceTimer::reportTree ( FILE *  out,
const TraceTimer node,
int  depth 

◆ activeChild()

const TraceTimer* TraceTimer::activeChild ( ) const

◆ macroTest()

void TraceTimer::macroTest ( )

◆ macroTest2()

void TraceTimer::macroTest2 ( )

◆ currentize()

void TraceTimer::currentize ( ) const

◆ computeRank()

int TraceTimer::computeRank ( ) const

◆ reportFullTree()

static void TraceTimer::reportFullTree ( FILE *  out,
const TraceTimer timer,
unsigned long long int  totalTime,
int  depth,
TraceTimer **  stack_bottom 

◆ reportOneTree()

static void TraceTimer::reportOneTree ( FILE *  out,
const TraceTimer timer 

◆ updateMemory()

static void TraceTimer::updateMemory ( TraceTimer a_timer)

◆ subReport()

static void TraceTimer::subReport ( FILE *  out,
const char *  header,
unsigned long long int  totalTime 

◆ reset() [2/2]

static void TraceTimer::reset ( TraceTimer timer)

◆ PruneTimersParentChildPercent() [2/2]

static void TraceTimer::PruneTimersParentChildPercent ( double  threshold,
TraceTimer parent 

Member Data Documentation

◆ s_roots

std::vector<TraceTimer*> TraceTimer::s_roots

◆ s_currentTimer

std::vector<TraceTimer*> TraceTimer::s_currentTimer

◆ s_peak

long long int TraceTimer::s_peak

◆ s_peakTimer

TraceTimer* TraceTimer::s_peakTimer

◆ m_pruned

bool TraceTimer::m_pruned

◆ m_parent

TraceTimer* TraceTimer::m_parent

◆ m_children

std::vector<TraceTimer*> TraceTimer::m_children

◆ m_name

const char* TraceTimer::m_name

◆ m_count

long long int TraceTimer::m_count

◆ m_accumulated_WCtime

unsigned long long int TraceTimer::m_accumulated_WCtime

◆ m_last_WCtime_stamp

unsigned long long int TraceTimer::m_last_WCtime_stamp

◆ m_rank

int TraceTimer::m_rank

◆ m_thread_id

int TraceTimer::m_thread_id

◆ s_memorySampling

bool TraceTimer::s_memorySampling

◆ m_memoryMin

unsigned int TraceTimer::m_memoryMin

◆ m_memoryMax

unsigned int TraceTimer::m_memoryMax

◆ m_flopEnter

unsigned long long int TraceTimer::m_flopEnter

◆ m_flops

unsigned long long int TraceTimer::m_flops

◆ s_tracing

bool TraceTimer::s_tracing

The documentation for this class was generated from the following file: