The first part of this weblog did not quite manage to open a short dump for display (as of Release NW04s 7.0 EHP1). Instead, it reviewed ways to extract contextual information from the short dump lists and elsewhere. In this second part of the weblog we, in the words of W. C. Fields, grab the bull by the tail and face the issue. In a short dump, you want to answer three primary questions: what happened, where did it happen, and what is the solution?
We will look at the diagnostic information and aids that the short dump offers for answering these questions.

From the Top: The Context of the Error

You have finally made it into one of the ABAP short dumps. You'll see a display that looks quite similar to this one.
The Short Dump Heading

On a bold red background at the top of every dump you will find the short dump ID and the date and time at which the dump occurred. Together with the Application Server and the WP Index from the dump list, you have all the information that you need to look for relevant messages in the ABAP System Log or Developer's Trace (see Part I of this weblog). If an exception occurred and the runtime error was catchable, the exception that caused the dump is also shown.

Together with the program name (ZSP_COMPLETE_FQDNS) from What happened?, you also already have enough information to search for a relevant OSS note. The combination of dump ID or exception name and program name should find the right note, if it exists. You'll also find a more extensive list of search terms for the OSS in the How to correct the error section in the dump.

The System Environment: Context Information and Where Did That RFC Actually Come From?

You probably skip over the context information presented under System Environment. But there are some worthwhile nuggets of information in there.
If you are analyzing the problem yourself, then here are three important bits of information:
What Happened Exactly: Short Text, What Happened, Error Analysis

The key question is: what happened exactly? You need to understand the problem in detail to be able to correct it. For this understanding, the Short text, What happened, and Error analysis sections are invaluable. The Short text states what happened in a single line. In our MOVE_TO_LIT_NOTALLOWED_NODATA dump (above), the short text is this:
The What happened section of this dump adds the name of the program in which the error occurred.
In many short dumps with few possible causes, the What happened section describes the error that occurred quite exactly. The MOVE_TO_LIT_NOTALLOWED_NODATA dump, however, can arise out of many different circumstances. It's not possible to say which transgression in the code produced the short dump, so detailed explanations are forced from What happened into the next very useful section, the Error Analysis. Short dump texts are written by SAP kernel developers. The Error analysis sections often provide really detailed information about the possible causes of a short dump, which in turn reflects the detailed knowledge of the kernel that these developers have. There is often a lot of text, but the time taken to read it through will be rewarded.
If you don't see where I try to overwrite a read-only field, then see the seventh point in the discussion in Error analysis, the one that begins "Accesses using field symbols..." Experience has shown that a lot of people just skip over the explanations in What happened and Error analysis. This may end up costing them more time than it saves.

Where Did It Happen: Source Code Extract

The SAP Short Dump developers were right to put Source Code Extract in initial caps, because, if you are lucky, this is a really nice, helpful section of the dump. You're shown exactly where the program was aborted. A few people don't know that from here, you can jump right into source code in the ABAP Editor with a double-click. In the dump that we have been following, it would be possible in the editor to branch to the definition of the internal table LT_CSMNSDIC, where you might notice that the CDNSNAME field has been declared as part of the key of the sorted internal table...
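The situation that this dump points to can be sketched in a few lines. This is a minimal sketch, not the original ZSP_COMPLETE_FQDNS code: the report name, the row type, the table LT_HOSTS, and the values are all hypothetical. Only the field name CDNSNAME and the fact that it belongs to the key of a sorted table come from the dump discussed here.

```abap
REPORT zsp_key_overwrite_sketch.

" Hypothetical row type; only the field name CDNSNAME is taken from the dump.
TYPES: BEGIN OF ty_host,
         cdnsname TYPE c LENGTH 64,
         note     TYPE c LENGTH 40,
       END OF ty_host.

" CDNSNAME is part of the table key, so it is read-only inside the table.
DATA lt_hosts TYPE SORTED TABLE OF ty_host WITH UNIQUE KEY cdnsname.
DATA ls_host  TYPE ty_host.

FIELD-SYMBOLS <ls_host> TYPE ty_host.

START-OF-SELECTION.
  ls_host-cdnsname = 'host01'.
  INSERT ls_host INTO TABLE lt_hosts.

  LOOP AT lt_hosts ASSIGNING <ls_host>.
    " Non-key fields may be changed through the field symbol.
    <ls_host>-note = 'checked'.

    " Writing to a key field through the field symbol passes the syntax
    " check but is expected to end in MOVE_TO_LIT_NOTALLOWED_NODATA.
    <ls_host>-cdnsname = 'host01.sap.corp'.
  ENDLOOP.
```

The syntax check cannot know that the field symbol points at a row of a sorted table, which is why an error of this kind tends to surface only at runtime.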
If the code line shown by the pointer doesn't seem to make any sense in the context of the dump, then take a look at the previous line of code. Occasionally, the instruction counter may still advance even after a dump has been triggered, so that the >>>>>> pointer points at the line following the bad line of code.

Where Did It Happen: Active Calls/Events

In program failures that involve infrastructure like Web Dynpro, or calls between components, or in which an uncaught exception has been passed up through the callers, the Active Calls/Events section may help you to understand the components involved in the crash. This call stack is a useful supplement to the point of failure marked in the Source Code Extract, because in the stack you can see how you got to the point of failure. You read the Active Calls/Events list from the bottom up. It shows all of the report events, dynpro modules, functions, methods and form routines through which the path of execution has come. You can jump into the ABAP Editor at any level in the call stack. This means that you can set breakpoints all along the way to the dump if you think that a problem at a higher level resulted in the dump at the end of the stack. There are two things to remember about the ABAP call stack:
Where Did It Happen: The Hard Way

Usually, the Source Code Extract shows where your error occurred. But if you are unlucky, you may have to determine this vital piece of information the hard way. As a not so tragic example, if a short dump occurs in a macro, then the source code pointer will be set to the macro call, not to the statement in the macro that caused the problem. An error in the kernel may leave no information in the Source Code Extract at all. In cases like these, how can you find out where the short dump occurred?

Let's start with the no-source-code, it's-a-macro case. The Source Code Extract does show where the misbehaving macro was called. Since you can jump into the ABAP Editor and then forward-navigate into the macro with a couple of clicks, you can first see if a good look at the macro code might reveal the problem.

If you still can't see where in the source code the problem occurred, then the ABAP Control Blocks (CONT) section may help you to localize the problem. The CONT table shows the CCBs - Control-Control-Blocks - which represent the ABAP statements to be executed in the processing blocks of an ABAP program. The short dump contains an extract of the CONT table showing the CCBs that lead up to the dump and the next few statements that were to be processed. Read the list of CCBs from the top down. Low-level as it is, the CONT does not care whether statements are in a macro or not - and it shows the short dump pointer that you know from the Source Code Extract.

Unfortunately, a double-click on the CCB at the dump pointer still takes you only to the point in the source code at which the bad macro was called. But the halfway intelligible CCB names may be enough to show you at which line of code in the macro the problem occurred. First of all, if the macro is not too long, then clicking on the CCBs to jump into the ABAP Editor shows you where the macro started.
Then, with a little jumping back and forth between the CONT table and the ABAP Editor, you can start to equate the CCBs and the statements in the faulty code. In our case, the SQLS and PAR1 CCBs turn out to reference an SQL SELECT well before the macro call. CCB 68, BRAF, represents the start of an IF control structure in which the macro is called. The COND and PAR1 CCBs depict the macro statement that actually failed: CONCATENATE &1 '.sap.corp' INTO &1.
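To make the macro case concrete, here is a minimal sketch of a macro built around the failing statement quoted above. The report name, the macro name COMPLETE_FQDN, and the variable LV_HOST are hypothetical. The sketch uses an ordinary variable, so it runs without dumping.

```abap
REPORT zsp_macro_sketch.

DATA lv_host TYPE c LENGTH 64 VALUE 'host01'.

" Hypothetical macro name; the body is the statement quoted in the text.
DEFINE complete_fqdn.
  CONCATENATE &1 '.sap.corp' INTO &1.
END-OF-DEFINITION.

START-OF-SELECTION.
  " Only complete names that do not contain a dot yet.
  IF lv_host NS '.'.
    " If the statement inside the macro dumps, the Source Code Extract
    " points at this call line, not at the CONCATENATE in the macro body.
    complete_fqdn lv_host.
  ENDIF.

  WRITE / lv_host.
```

If the macro were called with a protected field as its argument, the CONCATENATE inside it would be the statement that fails. The source code pointer would still stop at the call, which is why the CONT table is needed to tell the macro's statements apart.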
Other situations with no where-it-happened location: Should you not have any luck in finding out exactly where the program went down the tubes, then a useful tip is to try to reproduce the problem in transaction SAT, the ABAP Runtime Analysis. In SAT, you can trace the execution of an ABAP program at the level of ABAP processing blocks. Run your program to its dump (provided that this does not take too long - a non-aggregated SAT trace can get large quickly). Then check the SAT trace. It may help you find out pretty exactly where to look for the problem, even if the dump occurred in a macro.

Also, you can use ST05, the Performance Analysis, to switch on (in a controlled fashion - for your user, for example) a detailed trace of program activity. Be aware that the trace will also include the writing of the short dump. The dump processing starts where you find activity on DB table SNAP, so search for the problem area before that point. See help.sap.com for help with using SAT and ST05.

The Third Major Question: What's the Solution?

Naturally, the discussion that you will find in the How to correct the error section of a short dump tends to be a bit generic. Developers are constantly finding new and inventive ways to repeat old errors, like the MOVE_TO_LIT_NOTALLOWED_NODATA error that we have been examining. It's therefore not possible for How to correct the error to describe exactly what you should do to fix a dumping program. Even so, the combination of the discussion in How to correct the error and taking a good look at the faulty code often leads to success in correcting the problem. In the case of the MOVE_TO_LIT_NOTALLOWED_NODATA dump that we have been examining, the dump astutely remarks that 'The field to be overwritten is a parameter or a field symbol.' If you were not aware that the sort keys of a sorted table may not be overwritten through a field symbol, then the tip that a field symbol may be involved might help you get onto the right analytical track.
In the end, however, understanding and correcting the cause of a short dump rests on your shoulders. You will have to extract as much information from the short dump as possible, and use this information to illuminate what went wrong in the code.

Gathering More Information

A short dump addresses more or less directly the journalistic questions of what went wrong where and what to do. Should these questions be addressed 'less' rather than 'more' in a dump, then it is good to know that a dump also includes a lot of additional supporting information that can help you in your analysis.

System Variables

As an ABAP program executes, it is accompanied by an entire swarm of system variables, like Jupiter with its cloud of little moons. Some of these variables are well-known, like SY-SUBRC, the return code set by many ABAP instructions, or SY-TABIX, the row index set by the LOOP AT and READ TABLE internal table instructions. When a short dump occurs, ABAP preserves the state of the system variables at the time of the crash. You can see the contents of these variables in the Contents of system fields section. Here are some of the system variables that are most likely to be useful:
Program Variables

For the Chosen variables section, the short dump infrastructure takes a quick run through the collapsing program context, grabbing any program and infrastructure variables it finds that are currently in scope. The situation is a bit like the belated shopper running through a grocery just at closing time - there's no guarantee that the shopper will bring home everything that he or she was supposed to buy. Even though the dump infrastructure may not capture everything, much more often than not you will find the variables and values that you want to see. Since SAP_BASIS Release 6.20, the short dump infrastructure has captured a separate set of Chosen variables for each level in the Active Calls/Events ABAP call stack.
Chosen variables shows the size (here, one record with a length of 3440 bytes) of an internal table, as well as useful information such as the type of organization of the table (here, a sorted table). The table display can be useful in analyzing the popular dump of type TSV_TNEW_PAGE_ALLOC_FAILED (no more memory available for an internal table), since you can see how much memory has been allocated to hold the rows of each internal table. (The amount of storage allocated for the rows may not, however, be the amount of storage used by the rows of the table. If, for example, a table holds only data references to objects, then storage for references may not be all the memory actually consumed by the table and its contents. The references are relatively short. The objects may occupy much larger amounts of memory.)
In an upcoming release, the table display will contain at least the start of the contents of each of the first five records of each internal table that is captured. Finally, object references that have not been initialized (a favorite cause of OBJECTS_OBJREF_NOT_ASSIGNED_NO, and others...) are easy to pick out in Chosen variables. Just use Ctrl+F to search for ':initial}'.
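For readers who have not met this kind of dump, the following minimal sketch shows how an initial object reference comes about and one way to guard against it. The report name, the local class LCL_RESOLVER, and its method are hypothetical and exist only to give the reference something to point to.

```abap
REPORT zsp_objref_sketch.

" Hypothetical local class, used only as a target for the reference.
CLASS lcl_resolver DEFINITION.
  PUBLIC SECTION.
    METHODS resolve.
ENDCLASS.

CLASS lcl_resolver IMPLEMENTATION.
  METHOD resolve.
    WRITE / 'resolved'.
  ENDMETHOD.
ENDCLASS.

DATA lo_resolver TYPE REF TO lcl_resolver.

START-OF-SELECTION.
  " lo_resolver is declared but still initial here. Calling a method
  " through it at this point is the kind of access that produces an
  " uninitialized-reference dump.
  IF lo_resolver IS NOT BOUND.
    CREATE OBJECT lo_resolver.
  ENDIF.

  " Safe now: the reference points to an object.
  lo_resolver->resolve( ).
```

In a dump, a reference in the state shown before the CREATE OBJECT is what the search for ':initial}' in Chosen variables would turn up.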
Note that a random mouse click in the Chosen variables display switches the display from the relatively attractive formatted view to an unformatted view. Don't be alarmed. Just press F3 (Back) to return to the formatted display.

An Ounce of Prevention...

Is worth a pound of cure, as the old saying goes. Don't forget that ABAP offers logging and checkpoints that can be activated when needed (see help.sap.com). With these, you can turn on switchable logging, breakpoints, and assertions to help you with diagnosis and troubleshooting, should something go wrong in your program after it has reached your users. And don't forget the suite of tools that the ABAP Workbench offers to help you find errors before your users do, starting with tools for static checking like the Code Inspector (Transaction SCI), and continuing with the ABAP Unit Test facility, with which you can even go so far as to practice test-driven development. The best ABAP short dump is the one that you never have to analyze.
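As an illustration of the activatable checkpoints mentioned above, here is a minimal sketch of the three statement types. The report name, the checkpoint group ZSP_FQDN, the subkeys, and the variable are hypothetical. The group is assumed to have been created in transaction SAAB beforehand, since the statements do not compile against a group that does not exist.

```abap
REPORT zsp_checkpoint_sketch.

DATA lv_host TYPE c LENGTH 64 VALUE 'host01'.

START-OF-SELECTION.
  " ZSP_FQDN is a hypothetical checkpoint group, assumed to exist.
  " While the group is inactive, the three statements below do nothing.

  " Records the current value of lv_host when logging is activated.
  LOG-POINT ID zsp_fqdn SUBKEY 'before_completion' FIELDS lv_host.

  " Stops in the debugger only when breakpoints are activated for the group.
  BREAK-POINT ID zsp_fqdn.

  " Evaluated only when assertions are activated for the group.
  ASSERT ID zsp_fqdn SUBKEY 'host_not_empty'
         FIELDS lv_host
         CONDITION lv_host IS NOT INITIAL.

  WRITE / lv_host.
```

Because activation is controlled per checkpoint group, statements like these can stay in productive code and be switched on only while a problem is being investigated.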
- This weblog is based in part on Boris Gebhardt's Advanced ABAP Workshop: ABAP Analysis Tools. You can find more information on ABAP Test and Analysis Tools at help.sap.com and also in ABAP: Advanced Tools and Techniques, Volume 2, SAP Press 2009, ISBN 978-3-8362-1151-2.

Stephen Pfeiffer