PERI-DB Meetings 2007

Meeting Notes 2007

August 9, 2007

Shirley and Karen reported on the Petascale Tools Workshop held in Washington, DC on Aug 1-2. The purpose of the meeting was to determine priorities for future funding. Following presentations by three applications about their tool requirements and by Fred Johnson and Bart Miller, we broke up into working groups on correctness tools, performance tools, scalable infrastructure, and development environment infrastructure (e.g., Eclipse PTP). Management of performance data and of metadata about runs was brought up under scalable infrastructure and was rated Medium for probability/risk with no additional funding and Medium High for impact.

Dan and Karen reported on the Performance Tools for Petascale Computing workshop held July 16-19 in Snowbird, Utah.

We discussed how to proceed to the next level of interoperability that will involve actual exchange of data between tools. Participants will start trying to import from a different repository (e.g., Prophesy will try to import RENCI data), using the data dictionary as a guide.

We discussed SC'07. We will do a joint demo at the RENCI booth. We will discuss details by email and in our next conference call.

We discussed whether to have a working group meeting at the September PERI meeting. The consensus was that people were busy and we didn't have sufficient reason to have a meeting at this time.

We discussed which applications we will continue or start working with. We will continue with MILC and GTC. The TAU group is also working with S3D and will make that metadata available. We will work with new Tiger Team applications when those are assigned.

May 22, 2007

We discussed tool installation in /spin/proj/perc/TOOLS on the ORNL machines. PAPI 3.5.0, KOJAK 2.2, and TAU 2.16.3 have been installed by UTK and UOregon. We are using the convention that <toolname>_latest is a symlink pointing to the subdirectory containing the most recent version of the tool (e.g., papi_latest -> PAPI/papi-3.5.0). Under each tool subdirectory, there should be platform-specific subdirectories for each platform for which the tool is installed (e.g., papi_latest/xt3). (KOJAK is currently only installed for Jaguar and does not have platform-specific subdirectories, but we will fix this when we install the next KOJAK release.) Ying will install SvPablo in a similar manner. Shirley will inform Rice and the Performance Database Working Group so that they can install their tools in a similar manner.

UOregon and RENCI each thought the other was supposed to go first in adding their examples to the data dictionary spreadsheet. UOregon will go next, followed by RENCI. In the next call we will discuss the results and the next steps on implementing interoperability involving actual performance data interchange.

Karen sent out a mock-up of our SciDAC 2007 poster. Kathryn would like text for the individual tool descriptions by a week from now, and screenshots of collection and analysis of GTC_s performance data for each tool by two weeks from now.

Kevin requested input sets for GTC_s for larger numbers of processors. [Since the conf call, Shirley noticed that Stephane had said in his GTC_s instructions to increase the value of mzetamax in INPUT.d to run with larger numbers of processors. Kevin tried this and was able to run with up to 256 processors on Jaguar but crashed with more than that. Shirley will ask Stephane for help with this.]

Xingfu sent Shirley a report on Prophesy analysis of GTC and sent Shirley and Ying a report on Prophesy analysis of MILC. Shirley will read the GTC report in detail, send it to the group, and ask for the GTC developers' permission before posting to the PERI website. Ying will ask the MILC developers for their permission to post the MILC report.

We decided on the following authorship for the SciDAC 2007 5-page paper:

UOregon - Kevin Huck, Sameer Shende, and Allen Malony
TAMU - Valerie Taylor and Xingfu Wu
LBNL - Dan Gunter
Portland State - Karen Karavanic and Kathryn Mohror
UTK - Shirley Moore
UNC/RENCI - Ying Zhang
LLNL - John May 

We discussed the outline of the paper that Shirley had sent out. Dan suggested mentioning the two levels of interoperability up front and making subsections for these. Shirley will make that change. Dan will write material for the XML schema and web interface subsections. Each tool group should write material for their subsection in the tool section and for the MILC and GTC subsections (if applicable) in the application examples section.

The next call will be on Thursday, June 7, at 4pm Eastern time.

May 1, 2007

We first discussed software installation on Jaguar. We will make tool-specific subdirectories (e.g., tau-2.16.3, papi-3.5.0) under /spin/proj/perc/TOOLS and then put architecture-specific directories (e.g., xt3) under those. We decided to use xt3 as the subdirectory name for the Jaguar backend for now. We will make <tool-name>_latest be a symbolic link pointing to the subdirectory of the latest version for each tool (e.g., tau_latest, papi_latest). papi-3.5.0 has been installed. Shirley is working on installing kojak-2.2. The TAU team will re-install tau-2.16.3 once that is done. We are also cleaning up /spin/proj/perc/TOOLS by removing old directories. Shirley will inquire about the possibility of a PERI_HOME environment variable defined to point to the PERI location on different machines used by SciDAC.
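
The layout described above can be sketched in a few lines of Python. This is only an illustration of the naming convention, not the procedure actually used on Jaguar: the function names are made up, the 3.4.0 version is a hypothetical older release used to show the link being repointed, and the idea that PERI_HOME would contain a TOOLS directory is an assumption (the notes only say the variable was being inquired about).

```python
import os
import tempfile
from pathlib import Path


def install_layout(tools_root: Path, tool: str, version: str, platform: str) -> Path:
    """Create <tool>-<version>/<platform> and point <tool>_latest at that version."""
    version_dir = tools_root / f"{tool}-{version}"
    platform_dir = version_dir / platform
    platform_dir.mkdir(parents=True, exist_ok=True)

    link = tools_root / f"{tool}_latest"
    if link.is_symlink():
        link.unlink()  # drop the old link so it can name the newer version
    # A relative target keeps the link valid if the tree is mounted elsewhere.
    link.symlink_to(version_dir.name, target_is_directory=True)
    return platform_dir


def tools_root_from_env(default: Path) -> Path:
    """Use PERI_HOME if set (assumed here to contain a TOOLS directory)."""
    home = os.environ.get("PERI_HOME")
    return Path(home) / "TOOLS" if home else default


# Demo in a scratch directory rather than the real /spin/proj/perc/TOOLS.
demo_root = Path(tempfile.mkdtemp())
install_layout(demo_root, "papi", "3.4.0", "xt3")  # hypothetical older version
install_layout(demo_root, "papi", "3.5.0", "xt3")
latest_target = os.readlink(demo_root / "papi_latest")
```

With this layout a build script can refer to papi_latest/xt3 without knowing the installed version number.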

The Prophesy and PerfTrack teams have filled in the data dictionary/spreadsheet with examples for MILC. The TAU team will go next, then SvPablo/HPC Database.

We decided to use GTC as the application for illustrations on the SciDAC2007 poster. Kathryn requested screenshots of the tools showing data that have been gathered and analyzed. The machine of most interest is Jaguar.

Shirley has received the GTC_s code that has been ported to Jaguar from developer Stephane Ethier and has forwarded it to people who have requested it. The draft Joule report which contains the baseline run results on Jaguar is also available. The developers are interested in seeing more performance results.

The next call will be Tuesday, May 22, at 4pm Eastern.

April 17, 2007

We agreed to do a PERI DB poster for SciDAC 2007, emphasizing application support and interaction rather than the technical aspects of database interoperability. Shirley will let Pat Worley know. Karen and Kathryn will take the lead on the poster. Shirley will send them the "hourglass" slide that shows tools interoperating via a database.

We decided to add examples to the data dictionary/spreadsheet. Karen and Kathryn will take a first crack at that. Valerie suggested that all of us use the same application on the same machine for our examples. We decided on MILC on Jacquard.

We are installing our tools in /spin/proj/perc/TOOLS on Jaguar. Shirley (or other people at UTK) will install the most recent versions of PAPI and KOJAK in /spin/proj/perc/TOOLS on Jaguar and let the TAU team know so that they can re-install TAU with these versions.

Next conference call will be Tuesday, May 1, 4pm Eastern.

April 10, 2007

Attendees: Alan Morris, Sameer Shende, Kevin Huck, Allen Malony, Xingfu Wu, Ying Zhang, Shirley Moore, Karen Karavanic, Kathryn Mohror, Dan Gunter

The main agenda item was to discuss the data dictionary spreadsheet. So far the dictionary was started by Valerie and Xingfu (Prophesy) and added to by Kevin (TAU PerfDMF) and Ying (RENCI HPC Database).

We discussed each category and the individual terms in detail up to Resource Information.

- User Information (good agreement)
- Application Information (good agreement)
- Executable Information
  Karen remarked that PerfTrack includes the pathname for the executable.
- Run Information
  We discussed the difference between number of processors on the machine and number of processors or processes used for the run.  It's possible the numbers of processors and processes used might not be the same.  Total number of processors on the machine could/should be moved to System Information.
- System Information
- Input Information should be moved to Run Information.
- Processors per node should be moved to System Information.
- File Information should be moved to a new Build Information category.
- Compiler Information can be moved to Build Information.
- Function Information
  should be moved to Build Information (what about dynamically loaded functions?)
  could also have finer granularity -- e.g., loop level
- Program Control Flow
  static in Prophesy --> Build Information
  dynamic in PerfDMF and HPC Database --> Run Information
- Library Information
  static --> Build Information
  dynamic --> Run Information
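
One possible reading of the proposed moves, written out as a plain Python mapping so the resulting categories can be seen at a glance. The term names are paraphrased from the discussion above, the lookup helper is hypothetical, and none of this reflects the final spreadsheet, which was still being revised.

```python
# Categories after the moves discussed above (a sketch, not the final dictionary).
REVISED_CATEGORIES = {
    "User Information": [],
    "Application Information": [],
    "Executable Information": ["executable pathname"],  # recorded by PerfTrack
    "Run Information": [
        "processors used",
        "processes used",
        "input information",
        "program control flow (dynamic)",
        "library information (dynamic)",
    ],
    "System Information": [
        "total processors on machine",
        "processors per node",
    ],
    "Build Information": [
        "file information",
        "compiler information",
        "function information",
        "program control flow (static)",
        "library information (static)",
    ],
}


def category_of(term):
    """Return the category that lists `term`, or None if it is not listed."""
    for category, terms in REVISED_CATEGORIES.items():
        if term in terms:
            return category
    return None


compiler_home = category_of("compiler information")
```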

Karen will add the PerfTrack fields and make as many of the changes that we discussed as she can before the next conference call.

Next conference call will be Tuesday, April 17, 4pm Eastern.

March 13, 2007

In-person meeting held 1-5pm at the Emeryville Courtyard by Marriott following the March 12-13(am) PERI meeting.

Present: Shirley Moore, Karen Karavanic, John May, Allen Malony, Kevin Huck, Ying Zhang, Robert Fowler, Valerie Taylor, Bronis de Supinski, Dan Gunter, Bob Lucas, David Bailey

In the PERI meeting, Shirley had given an overview of the Performance Database working group effort. Karen and John, Allen, Valerie, and Ying each gave presentations on their tools and how they are implementing interoperation using the common schema.

Robert Fowler gave a brief presentation on a tool for grid workflow automation of performance campaigns. The campaigns include scaling and parameter sweep studies. The tool uses the Generic Service Toolkit (GST) for launching remote CLI applications.

Karen gave a brief presentation on a previous effort called PPerfGrid for federating performance databases where each site chooses what to publish when.

We asked Bob Lucas for input on what issues he would like us to address. His response was:

- how well is outside collaboration working?
- can the database be a mechanism for integrating tools (as suggested by Valerie in her Outside Perspective talk)?

How well is outside collaboration working?

- Valerie: Good to have applications to work with, see different perspectives, see ties with other aspects of PERI
- John: Found out about efforts he was not aware of, interesting to see how problems have been addressed in different ways, needs to understand what PERI wants to do with performance databases (pick one, more than one, build new one?)
- Karen: has heard that SciDAC/PERI are not willing to share data, she needs to be sure that her students can publish their results

Database as a mechanism for integrating tools: Bob described his "portal fantasy". There are a dozen tools among PERI researchers. It would be good to have one common way to exchange information. PERI tools + database ==> guidance

We discussed access control issues. Rob remarked that benchmark code results can be fully public but that there will be different situations with different code developers and teams. For example, QCD is a big team whose members compete as well as collaborate among themselves. We decided to keep performance data accessible only to the immediate code development team and (if applicable) the Tiger Team. For TAU PerfDMF, Kevin will create a separate database with a different password for each code for which the TAU team is collecting performance data.

We discussed what approach we should use for further interoperability. The goal is to allow the different databases to exchange actual performance data. John suggested we decide on a common interchange format for performance data. Valerie suggested constructing a dictionary (analogy to Excel spreadsheet -- terms as rows, column for tool) that shows all terms for each database and where there are terms with same semantics. Kevin remarked that we will have an 80-90% solution (i.e., some loss of data in exchange). Allen suggested filters to reduce the data requested by a tool.
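
A minimal sketch of how such a dictionary could drive an exchange, assuming it is held as a table with common terms as rows and one column per tool. The tool names (tool_a, tool_b) and every field name are hypothetical placeholders, not the real schemas of any of the databases. Fields with no counterpart in the target are reported rather than silently lost, which is one way to make the "80-90% solution" visible.

```python
# Rows are common terms; columns are tools. None marks a term a tool does not store.
# All tool and field names are hypothetical placeholders.
DATA_DICTIONARY = {
    "application_name": {"tool_a": "app", "tool_b": "application"},
    "num_processes": {"tool_a": "nprocs", "tool_b": "process_count"},
    "compiler_flags": {"tool_a": "cflags", "tool_b": None},
}


def translate(record, source, target):
    """Rename fields from `source` to `target`; return (translated, dropped_fields)."""
    by_source = {
        columns[source]: columns[target]
        for columns in DATA_DICTIONARY.values()
        if columns.get(source)
    }
    translated, dropped = {}, []
    for field, value in record.items():
        target_field = by_source.get(field)
        if target_field is None:
            dropped.append(field)  # no equivalent term in the target tool
        else:
            translated[target_field] = value
    return translated, sorted(dropped)


# Made-up record in tool_a's vocabulary.
example = {"app": "MILC", "nprocs": 64, "cflags": "-O3", "custom_counter": 1}
converted, lost = translate(example, "tool_a", "tool_b")
```

Here two of the four fields carry over and two are reported as dropped: one because the target has no matching term, and one because it is not in the dictionary at all.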

We discussed which code(s) to proceed with for the interoperability work. Allen suggested an easily available code with no restrictions to do a code investigation/characterization -- e.g., FACETS codes. Bob suggested Phil Colella's library codes. Shirley suggested GTC since it is a high priority code. Karen expressed concerns about permission to publish results on GTC. Shirley will vet plans to use GTC with the developers and if they are agreeable and do not see a problem with publishing results, we will proceed with GTC.

Action items:

- Valerie will do initial draft of dictionary/spreadsheet for database terms and then pass it off to others (TAU team, SvPablo, PerfTrack).
- Kevin will create a separate PerfDMF database for each code.
- Shirley will vet plans to use GTC for further interoperability work with developers.

Next conference call will be at 4pm Eastern on Tuesday, April 10.

February 27, 2007

Present: Allen Malony, Alan Morris, Sameer Shende, Shirley Moore, Dan Gunter, John May, Karen Karavanic

We discussed the metadata sets that have been collected so far. The TAU team has collected performance data and metadata for MILC on Jacquard and posted the metadata to the wiki. They have also collected performance data for GTC on Jacquard but metadata for this have not yet been posted. Prophesy data and metadata have been collected for MILC runs on the RENCI BG/L and the metadata posted on the wiki. Dan has loaded the TAU team's MILC metadata into the search/browse interface but was not aware of the Prophesy metadata so had not included that yet.

We discussed how we want the top-level search/browse display to look. We decided that for now to provide access to the actual performance data, we would provide a contact email and an identifier for the desired data -- e.g., a trial ID in the case of TAU PerfDMF data. We decided to add the machine name (e.g., Jacquard) to the schema and to have it appear in the top-level display. We discussed interpretation of the concurrency field. This should be number of "units of execution", as in the MILC example posted by the TAU team where concurrency gives the number of processes. We discussed how to display the node list. It will be very unwieldy to show all the nodes if there are hundreds or thousands of them. We decided an expandable tree display as is used by TAU and KOJAK would be good.
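
As a rough illustration of the expandable node list, the sketch below groups node names by a prefix and shows only a per-group count until a group is expanded. It does not reproduce how TAU or KOJAK build their displays; the node naming scheme (group and node separated by a hyphen) and the function names are assumptions.

```python
from collections import defaultdict


def build_node_tree(nodes, sep="-"):
    """Group node names by the text before the first separator."""
    tree = defaultdict(list)
    for name in nodes:
        group, _, _ = name.partition(sep)
        tree[group].append(name)
    return {group: sorted(members) for group, members in sorted(tree.items())}


def render(tree, expanded=()):
    """Return display lines; only groups named in `expanded` list their members."""
    lines = []
    for group, members in tree.items():
        marker = "-" if group in expanded else "+"
        lines.append(f"[{marker}] {group}: {len(members)} node(s)")
        if group in expanded:
            lines.extend(f"    {member}" for member in members)
    return lines


# Made-up node names for the demo.
node_tree = build_node_tree(["r1-n02", "r1-n01", "r2-n01"])
collapsed = render(node_tree)
opened = render(node_tree, expanded=("r1",))
```

The collapsed view has one line per group, so a run on thousands of nodes stays readable at the top level.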

We decided that all the available metadata should be uploaded, not just a few example runs, so as to better illustrate the search interface.

Karen's group is working on targeting the PerfTrack metadata collection scripts to new platforms. The TAU team has been using their own tools to collect metadata so far.

We planned the presentation and demo for the PERI meeting. The purpose is to describe our approach and get feedback. We will request a 90-minute time slot that will be divided up as follows if it is granted:

- 15 min - intro and description of approach (Shirley and Dan)
- 15 min - PerfTrack (Karen)
- 15 min - TAU PerfDMF (Allen)
- 15 min - Prophesy (Valerie)
- 15 min - SvPablo (Ying)
- 15 min - questions and discussion

The individual tool sessions will provide demos of the actual tools for looking at the data in the databases.

We came up with the following discussion questions:

- How can we encourage tiger and code teams to use our tools and metadata interface?
- How can we encourage collection of the PERI metadata so as to ensure that a process is adopted by the code and tiger teams that preserves the needed context for performance experiments?
- Should we do more work to make the interface more seamless -- e.g., so that performance data can be automatically retrieved rather than needing to send an email?
- General input on the approach and interface

We will coordinate by email to prepare for the March 13 PERI meeting unless the need for a conference call before then arises.

February 13, 2007

Present: Allen Malony, Sameer Shende, Alan Morris, Kevin Huck, Shirley Moore, John May, Karen Karavanic, Valerie Taylor

We discussed progress with the MILC code. Kevin has built MILC on Jacquard and will collect data using TAU. Valerie asked for notes from Kevin on how to build the relevant parts of the code. The TAU group is waiting until the XML schema is finalized and stable to try exporting to it.

Karen emailed the run scripts for collecting the metadata to the list but they did not go through. She has resent them.

Shirley proposed GTC as another code to work with. Shirley is the lead for the GTC Tiger Team and is supposed to be getting two versions of the GTC code from the developers. She will forward them and instructions for running them as soon as she receives them. GTC has a known performance bottleneck in its scatter-gather algorithm so this could be interesting to investigate. GTC is a Joule code and the goal is to have it run 50% faster. The new version, called GTC_s, is a major re-write and has not been optimized. Both versions of the code will run on Jacquard.

Shirley will send out an agenda for the March 13 meeting. The current plan is to interact with the PERI team on the morning of March 13 and have our working group meeting that afternoon, ending no later than 5pm.

Dan could not attend but has posted the most recent XML schema and an example of the minimal data that need to be collected in order for the search/browse interface to work. Please take a look and send any comments to Dan and cc the list.

Action items:

- send Yeen your information for the meeting hotel if you have not already done so
- look at the minimal data set that Dan posted and send Dan any comments and cc the list
- try using Karen's scripts on Jacquard (TAU group, Valerie)
- send Valerie notes on how to build the relevant portions of MILC (Kevin)
- send both versions of GTC and run instructions to the list (Shirley)
- send out agenda for the March 13 meeting (Shirley)

Next conference call will be 4pm Eastern on Tue 27 Feb 2007.

January 23, 2007

Conference Call notes

We discussed tasks and a schedule for initial implementation of metadata collection and a browse/search interface before the March PERI meeting. The idea is to be able to search and browse the metadata to see what performance data has been collected for what codes and determine what performance data one wants to look at and then be directed to the performance database that has that data. Once the user has been directed to a particular performance database, he/she would use that database's interface. We are not attempting to allow the user to simultaneously view or analyze data from different databases in our initial implementation.
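
The search-then-redirect idea can be sketched as a small catalog of metadata records, each carrying a pointer to the database that holds the actual data. This is an assumption-laden illustration, not the web interface Dan was implementing: the record fields, the identifier values, and the contact address are all made up.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class RunMetadata:
    application: str
    machine: str
    concurrency: int
    database: str  # which performance database holds the actual data
    data_id: str   # identifier inside that database (hypothetical values below)
    contact: str   # who to ask for access (placeholder address below)


def search(catalog, **criteria):
    """Return records whose fields equal every given criterion."""
    return [
        record
        for record in catalog
        if all(getattr(record, key) == value for key, value in criteria.items())
    ]


# Made-up catalog entries for the demo.
catalog = [
    RunMetadata("MILC", "Jacquard", 64, "TAU PerfDMF", "trial-1", "contact@example.org"),
    RunMetadata("MILC", "BG/L", 128, "Prophesy", "run-7", "contact@example.org"),
    RunMetadata("GTC", "Jacquard", 64, "TAU PerfDMF", "trial-2", "contact@example.org"),
]
milc_hits = search(catalog, application="MILC")
where_to_look = sorted({record.database for record in milc_hits})
```

The search only answers "which database has this run"; viewing the data is left to that database's own interface, as described above.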

We discussed codes to use for the initial implementation. Ying suggested the MILC QCD code. She said Chroma was also a possibility, but since it currently runs only on Linux systems, we decided to go with MILC. John suggested QBox, a molecular dynamics code, and will investigate this possibility (John didn't know if QBox was a SciDAC code).

We had previously decided that our initial platforms would be Jacquard (NERSC Linux cluster) and BG/L (RENCI or Argonne).

Allen and Valerie have volunteered to do an implementation of combining metadata collection with their performance database for TAU PerfDMF and Prophesy, respectively.

Action items to be accomplished before Feb 6:

- Karen will post PerfTrack scripts for metadata collection on the wiki (these should currently work on Linux clusters; Karen will investigate extending them to BG/L).
- Ying will email a description of the MILC code and how to obtain and build it.
- Allen and Valerie will get accounts for their teams on Jacquard and a BG/L.
- Dan will start implementing a search/browse web interface for the metadata.
- Dan will update the schema on the wiki to include a section for a pointer to the external DB that has the actual performance data.
- Dan will send out an example of an XML document that contains the minimum required information (for discussion on the next call).

Next conference call will be 4pm Eastern on Tue 6 Feb 2007.

January 9, 2007

Conference Call Notes

We discussed changes to the XML schema:

- "error" to be changed to "error_messages"
- "job_status" to be changed to "job_completion_status"
- add timestamps where needed
- add scheduler_parameters

Dan will make the changes and post the new schema.
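
For documents already written against the older draft, the two renames could be applied mechanically. The sketch below uses Python's standard xml.etree.ElementTree and assumes that "error" and "job_status" are element names (the notes do not say whether they are elements or attributes) and that scheduler_parameters is a child of the root; the root element name "run" is also hypothetical.

```python
import xml.etree.ElementTree as ET

# Renames agreed on the call; assumed here to be element names.
RENAMES = {"error": "error_messages", "job_status": "job_completion_status"}


def migrate(xml_text):
    """Apply the renames and make sure a scheduler_parameters element exists."""
    root = ET.fromstring(xml_text)
    for element in root.iter():
        if element.tag in RENAMES:
            element.tag = RENAMES[element.tag]
    if root.find("scheduler_parameters") is None:
        ET.SubElement(root, "scheduler_parameters")  # empty placeholder
    return ET.tostring(root, encoding="unicode")


# Made-up document in the old vocabulary.
old_document = "<run><job_status>completed</job_status><error>none</error></run>"
new_document = migrate(old_document)
new_root = ET.fromstring(new_document)
```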

Valerie asked for clarification on the meaning of scheduler queue contents. Jeff said the idea is to capture the queue contents when the job starts so as to be able to account for interference between jobs.

We discussed how to implement tools to generate the metadata. The idea is to have a master script for an experiment that can call Python scripts that collect the metadata. The scripts would need to interface with the performance measurement system. Valerie suggested getting compiler information from Makefiles. Valerie suggested having macros collect metadata that doesn't change with every run on a daily basis and storing it, rather than collecting it every time.
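
A minimal sketch of the caching idea, assuming the slow-changing metadata is stored as a JSON file and refreshed when it is more than a day old. The function names, the cache file, and the particular fields collected are illustrative choices; interfacing with a performance measurement system is not shown.

```python
import json
import platform
import tempfile
import time
from pathlib import Path

ONE_DAY = 24 * 60 * 60  # seconds


def collect_static():
    """Metadata that rarely changes between runs (illustrative fields only)."""
    return {"hostname": platform.node(), "os": platform.system(), "arch": platform.machine()}


def cached_static(cache_file, max_age=ONE_DAY, now=None, collector=collect_static):
    """Reuse the cached metadata unless the cache file is older than max_age."""
    now = time.time() if now is None else now
    if cache_file.exists() and now - cache_file.stat().st_mtime < max_age:
        return json.loads(cache_file.read_text())
    data = collector()
    cache_file.write_text(json.dumps(data))
    return data


def collect_run_metadata(cache_file, per_run):
    """What a master script might call once per experiment."""
    metadata = dict(cached_static(cache_file))
    metadata.update(per_run)  # values that do change every run
    metadata["timestamp"] = time.time()
    return metadata


# Demo with a counting collector to show when collection really happens.
cache_path = Path(tempfile.mkdtemp()) / "static_metadata.json"
calls = []


def counting_collector():
    calls.append(1)
    return {"hostname": "example-host"}


first = cached_static(cache_path, collector=counting_collector)
second = cached_static(cache_path, collector=counting_collector)  # served from cache
calls_after_second = len(calls)
stale = cached_static(cache_path, collector=counting_collector,
                      now=time.time() + 2 * ONE_DAY)  # cache too old, collected again
run_record = collect_run_metadata(cache_path, {"num_processes": 64})
```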

We would like to aim for a prototype implementation of the metadata collection and search/browse interface for the March PERI meeting. The suggestion was made to focus on a few codes. The QCD project is a possible candidate [note: QCD has since decided to implement their own system for metadata collection].

Next call will be Jan 23, 4pm Eastern.

December 19, 2006

Conference Call notes

Present: Shirley Moore, Dan Gunter, Allen Malony and TAU Team, Valerie Taylor, John May, Karen Karavanic, Ying Zhang

We discussed how to proceed with implementing a performance database for SciDAC applications. We first reviewed the draft run rules document that Shirley had posted to the wiki.

Boyana suggested that we provide tools that applications people can use to obtain the information they are supposed to provide as metadata. She also pointed out that some information may not be available on some platforms.

Karen asked if we can assume a version control system. Boyana said she is working with two codes that do not use version control. We decided that a version control system should be leveraged if it is available but that other means of identifying the software version would have to be used if not.
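
The fallback described above could look roughly like the following. The version control query is passed in by the caller because the notes do not name a system, and the "other means" shown here, a SHA-256 hash over the source tree, is only one possible choice and an assumption of this sketch.

```python
import hashlib
import subprocess
import tempfile
from pathlib import Path


def software_version(source_dir, vcs_command=None):
    """Return (method, identifier) for the code in source_dir.

    vcs_command is whatever command prints a revision for the project's
    version control system; if it is missing or fails, hash the files instead.
    """
    source_dir = Path(source_dir)
    if vcs_command:
        try:
            result = subprocess.run(vcs_command, cwd=source_dir, capture_output=True,
                                    text=True, check=True)
            if result.stdout.strip():
                return "vcs", result.stdout.strip()
        except (OSError, subprocess.CalledProcessError):
            pass  # no usable version control; fall through to hashing

    digest = hashlib.sha256()
    for path in sorted(p for p in source_dir.rglob("*") if p.is_file()):
        digest.update(path.relative_to(source_dir).as_posix().encode())
        digest.update(path.read_bytes())
    return "content-hash", digest.hexdigest()


# Demo: a scratch tree and a deliberately nonexistent command to force the fallback.
scratch = Path(tempfile.mkdtemp())
(scratch / "main.c").write_text("int main(void) { return 0; }\n")
method, identifier = software_version(scratch, ["no-such-vcs-command-for-demo"])
method_again, identifier_again = software_version(scratch)
```

Hashing identifies the exact source that was built but, unlike a revision number, says nothing about its history, so it is a weaker substitute.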

Following up on Boyana’s suggestion of tools, Allen emphasized the need for ways to collect the metadata across platforms since this will not be done by the performance database. Karen said 100% automation is needed if applications are to use this. Dan pointed out that IPM collects some of the metadata.

We decided to focus first on implementing the metadata collection on two machines. We decided to initially target BG/L (Argonne or RENCI) and the Jacquard Opteron cluster at NERSC. Boyana can get people Argonne accounts, and Ying can get people RENCI accounts. Shirley will ask David to get non-PERI people accounts on Jacquard.

We decided that XML would be the best format for the metadata schema. Karen will post the PerfTrack XML schema as an example. Shirley (with help from Dan Gunter who volunteered by email following the conference call) will post a draft XML schema for PERI metadata by Jan. 2.

Next conference call will be Tuesday, Jan. 9, at 4pm EST.

December 5, 2006

Conference call notes

Present: Shirley Moore, Dan Gunter, Allen Malony, Valerie Taylor, John May, Karen Karavanic, Ying Zhang, Pat Worley

We discussed issues involved with collection and storage of and access to performance data for SciDAC applications.

We first reviewed what we knew about applications already using or planning to use performance data collection tools:

- Shirley: FACETS and Chem/CCA using TAU, applications running at NERSC using IPM
- Ying: SvPablo used to collect performance data for MILC, database not yet released, other use of HPCToolkit
- Pat: fusion and climate projects use portable timing libraries, HPM and Cray hwpc
- Valerie: database using Postgres, performance data plus metadata, scripts to get performance data from other tools incl. SvPablo, GYRO, CAM, high-energy physics LIGO, cosmology ENZO, CFD apps; primarily timing data from Prophesy
- Karen and John: PerfTrack, flexible schema to accommodate different types of performance data, different underlying databases, automated scripts to collect info about environment; support gprof, mpiP, application-specific timing, hw data on BlueGene (with Martin Schulz)

This discussion had already branched into discussion of the various performance database tools this group has to offer, so we continued with that.

Allen described TAU’s PerfDMF which supports parallel profiles and works over a backend SQL database.

Allen emphasized the importance of getting context information that is needed to be able to do tracking and analysis later. Pat suggested a set of run rules for doing performance data collection that would capture this information. Dan suggested the PERI wiki as a clearinghouse of metadata about the runs that would point to where the actual data could be obtained.

Valerie brought up the issue of mapping data to analysis type and vice versa.

Valerie asked about access to data for people outside PERI. We expect that everyone in this group will have access, as the idea is to involve the broader performance analysis community with PERI.

We discussed more generally how access to the data should be handled, with possibilities ranging from only an application team having access to their own data, to just the PERI team and collaborators having access, to public access. Initially, since just the metadata will be available on the PERI wiki, access control will be handled however it is already being done by the individual databases. Although the metadata itself could potentially be sensitive, we decided to defer that issue until if and when it occurs.

Allen started a discussion about interoperability: importers and translators, getting a path to different analysis tools, and data merging. John remarked on the tough technical problem of reconciling different schemas. Karen said we would need to define one common schema (union rather than intersection) and that tools would need to map to it and be able to handle missing data. Although the goal would be many-to-many interoperation via the common schema, we decided implementing pair-wise interoperation would be an easier, more tractable first goal.

Action items:

The four performance database groups (SvPablo, Prophesy, TAU PerfDMF, PerfTrack) will send Shirley email about the context information they currently collect. Shirley will draft an initial run rule document that will be posted on the wiki for discussion and refinement and presented to PERI in the Dec 13 conference call.

We will continue to discuss how to handle performance database interoperation. Initially, the metadata on the PERI wiki will include which tool(s) can be used to access a given dataset. The next step would be to be able to use a desired analysis tool with data pulled from an interoperating database using an importing or translation tool and the common schema.

Next conference call for this effort will be Tuesday, Dec 19, 2006, 4pm Eastern time.
