PERIXML Perf Data

Introduction

The purpose of this page is to record the arguments for and against certain features of the PERIXML "data" schema. Historically, these discussions have taken place in email, where they are easily forgotten and then repeated.

Details: Subversion

For more details, look in Subversion:

svn co https://bosshog.lbl.gov/repos/peridb

The files of interest are:

peridb/schemas/peridb-perfdata.rnc
Schema file
peridb/schemas/peridb-perfdata.txt
Textual description of approach
peridb/examples/ipm/ex1.xml,ex2-{peri,ipm}.xml
IPM conversion examples

Initial status

The initial status of the "data" parts of the schema is summarized below.

Major sections:

  • codeStatic (also referenced in runrules): Locations in the source or object code, such as file and line number; program and library names.
  • codeDynamic: Locations relative to the execution resource(s), such as the names and MPI ranks of execution nodes; process and thread identifiers.
  • metrics: Description of the performance metrics exported from this run, such as hardware counters or MPI statistics.
  • results: The actual performance results, which are either a combination of a value, a metric, and one or more codeStatic/codeDynamic locations, or reference(s) to an external representation of the results.
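
As a rough illustration, a document instance combining these four sections might look like the sketch below. The element and attribute names are assumptions for illustration only and need not match the actual schema; PAPI_FP_OPS is just an example hardware counter.

   <perfData>
     <codeStatic>
       <file id="S1" name="solver.c"/>              <!-- a source-code location -->
     </codeStatic>
     <codeDynamic>
       <process id="D1" rank="0"/>                  <!-- an execution resource (MPI rank 0) -->
     </codeDynamic>
     <metrics>
       <metric id="M1" name="PAPI_FP_OPS"/>         <!-- a hardware-counter metric -->
     </metrics>
     <results>
       <value metric="M1" static="S1" dynamic="D1">1.26e9</value>
     </results>
   </perfData>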

Issues

Record any and all problems with the schema here. Please sign your entries with four tildes (~~~~).

Suggested re-org

Instead of a separate "results" section, nest the results in with the dynamic/static locations. The locations themselves do not need to be broken into two pieces either, but can be nested inside each other as needed. Keep the separate metrics section, though.

Instead of:

   <codeStatic>
     <basepath ..>
       <file ..>
         <line id="L12">
           <value>12</value>
   ...
   <results>
     <data loc="L12">
       <values metric="M1">12.6</values>

Do:

   <loadModule>
     <function>
       <procedure>
         <loop>
           <statement line="12">
             <v n="M1">12.6</v>
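
For concreteness, a complete, well-formed fragment in this nested style might look like the following sketch. The attribute names (name, line, units) and the WALLCLOCK metric are assumptions for illustration, not part of the current schema, and the procedure level is omitted for brevity:

   <metrics>
     <metric id="M1" name="WALLCLOCK" units="seconds"/>   <!-- metrics keep their own section -->
   </metrics>
   <loadModule name="a.out">
     <function name="solve">
       <loop line="10">
         <statement line="12">
           <v n="M1">12.6</v>                              <!-- result nested at its static location -->
         </statement>
       </loop>
     </function>
   </loadModule>

Nesting values directly at their locations avoids the id/ref indirection between the location and results sections, at the cost of interleaving data with structure.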


Suggested additions

In no particular order, for now. Feel free to add things here.

  • groups
    • combine measurements using application-level semantics, e.g. a "phase"
  • progress / effort loops (from HPCToolkit)
    • (description from Rob Fowler) A progress loop marks application-level steps, e.g., a time-step loop. Within that we identify effort loops, e.g., a convergence loop of an iterative solver. Performance metrics get normalized with respect to these, so one can say something about the efficiency of the computation toward application progress and distinguish cases in which things slow down because more iterations are needed to converge from cases in which each iteration takes more time. One goal is to partition the blame for bad performance between the algorithm and the coding details. For instance, some bad algorithms can get really high FLOP rates, but take forever to converge. A hypothetical markup sketch follows below.
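
To make these suggestions concrete, one purely hypothetical markup sketch is shown below; none of these elements (group, progressLoop, effortLoop) exist in the current schema, and the names are placeholders only:

   <group id="G1" kind="phase" name="transport">     <!-- application-level grouping, e.g. a phase -->
     <progressLoop name="timestep">                  <!-- marks application-level steps -->
       <effortLoop name="solver_iterations">         <!-- e.g. a convergence loop -->
         <v n="M1">4.2e8</v>                         <!-- metric attributable to this effort -->
       </effortLoop>
     </progressLoop>
   </group>

Normalizing a metric against the enclosing progressLoop would let a tool report, for example, floating-point operations per time step rather than a raw FLOP rate, which is the distinction described above.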