123456789101112131415161718192021222324252627282930313233343536373839404142434445464748495051525354555657585960616263646566676869707172737475767778798081828384858687888990919293949596979899100101102103104105106107108109110111112113114115116117118119120121122123124125126127128129130131132133134135136137138139140141142143144145146147148149150151152 |
- .TH dis88 1 LOCAL
- .SH "NAME"
- dis88 \- 8088 symbolic disassembler
- .SH "SYNOPSIS"
- \fBdis88\fP [ -f -o ] ifile [ ofile ]
- .SH "DESCRIPTION"
- Dis88 reads ifile, which must be in PC/IX a.out format.
- It interprets the binary opcodes and data locations, and
- writes corresponding assembler source code to stdout, or
- to ofile if specified. The program's output is in the
- format of, and fully compatible with, the PC/IX assembler,
- as(1). If a symbol table is present in ifile, labels and
- references will be symbolic in the output. If the input
- file lacks a symbol table, the fact will be noted, and the
- disassembly will proceed, with the disassembler generating
- synthetic labels as needed. If the input file has split
- I/D space, or if it is executable, the disassembler will
- make all necessary adjustments in address-reference calculations.
- .PP
- If the "-o" option appears, object code will be included
- in comments during disassembly of the text segment. This
- feature is used primarily for debugging the disassembler
- itself, but may provide information of passing interest
- to users.
- .PP
- If the "-f" option appears, dis88 will attempt to disassemble
- any file whatsoever. It has to assume that the file begins
- at address zero.
- .PP
- The program always outputs the current machine address
- before disassembling an opcode. If a symbol table is
- present, this address is output as an assembler comment;
- otherwise, it is incorporated into the synthetic label
- which is generated internally. Since relative jumps,
- especially short ones, may target unlabelled locations,
- the program always outputs the physical target address
- as a comment, to assist the user in following the code.
- .PP
- The text segment of an object file is always padded to
- an even machine address. In addition, if the file has
- split I/D space, the text segment will be padded to a
- paragraph boundary (i.e., an address divisible by 16).
- As a result of this padding, the disassembler may produce
- a few spurious, but harmless, instructions at the
- end of the text segment.
- .PP
- Disassembly of the data segment is a difficult matter.
- The information to which initialized data refers cannot
- be inferred from context, except in the special case
- of an external data or address reference, which will be
- reflected in the relocation table. Internal data and
- address references will already be resolved in the object file,
- and cannot be recreated. Therefore, the data
- segment is disassembled as a byte stream, with long
- stretches of null data represented by an appropriate
- ".zerow" pseudo-op. This limitation notwithstanding,
- labels (as opposed to symbolic references) are always
- output at appropriate points within the data segment.
- .PP
- If disassembly of the data segment is difficult, disassembly of the
- bss segment is quite easy, because uninitialized data is all
- zero by definition. No data
- is output in the bss segment, but symbolic labels are
- output as appropriate.
- .PP
- For each opcode which takes an operand, a particular
- symbol type (text, data, or bss) is appropriate. This
- tidy correspondence is complicated somewhat, however,
- by the existence of assembler symbolic constants and
- segment override opcodes. Therefore, the disassembler's
- symbol lookup routine attempts to apply a certain amount
- of intelligence when it is asked to find a symbol. If
- it cannot match on a symbol of the preferred type, it
- may return a symbol of some other type, depending on
- preassigned (and somewhat arbitrary) rankings within
- each type. Finally, if all else fails, it returns a
- string containing the address sought as a hex constant;
- this behavior allows calling routines to use the output
- of the lookup function regardless of the success of its
- search.
- .PP
- It is worth noting, at this point, that the symbol lookup
- routine operates linearly, and has not been optimized in
- any way. Execution time is thus likely to increase
- geometrically with input file size. The disassembler is
- internally limited to 1500 symbol table entries and 1500
- relocation table entries; while these limits are generous
- (/unix, itself, has fewer than 800 symbols), they are not
- guaranteed to be adequate in all cases. If the symbol
- table or the relocation table overflows, the disassembly
- aborts.
- .PP
- Finally, users should be aware of a bug in the assembler,
- which causes it not to parse the "esc" mnemonic, even
- though "esc" is a completely legitimate opcode which is
- documented in all the Intel literature. To accommodate
- this deficiency, the disassembler translates opcodes of
- the "esc" family to .byte directives, but notes the
- correct mnemonic in a comment for reference.
- .PP
- In all cases, it should be possible to submit the output
- of the disassembler program to the assembler, and assemble
- it without error. In most cases, the resulting object
- code will be identical to the original; in any event, it
- will be functionally equivalent.
- .SH "SEE ALSO"
- adb(1), as(1), cc(1), ld(1).
- .br
- "Assembler Reference Manual" in the PC/IX Programmer's
- Guide.
- .SH "DIAGNOSTICS"
- "can't access input file" if the input file cannot be
- found, opened, or read.
- .sp
- "can't open output file" if the output file cannot be
- created.
- .sp
- "warning: host/cpu clash" if the program is run on a
- machine with a different CPU.
- .sp
- "input file not in object format" if the magic number
- does not correspond to that of a PC/IX object file.
- .sp
- "not an 8086/8088 object file" if the CPU ID of the
- file header is incorrect.
- .sp
- "reloc table overflow" if there are more than 1500
- entries in the relocation table.
- .sp
- "symbol table overflow" if there are more than 1500
- entries in the symbol table.
- .sp
- "lseek error" if the input file is corrupted (should
- never happen).
- .sp
- "warning: no symbols" if the symbol table is missing.
- .sp
- "can't reopen input file" if the input file is removed
- or altered during program execution (should never happen).
- .SH "BUGS"
- Numeric co-processor (i.e., 8087) mnemonics are not currently supported.
- Instructions for the co-processor are
- disassembled as CPU escape sequences, or as interrupts,
- depending on how they were assembled in the first place.
- .sp
- Despite the program's best efforts, a symbol retrieved
- from the symbol table may sometimes be different from
- the symbol used in the original assembly.
- .sp
- The disassembler's internal tables are of fixed size,
- and the program aborts if they overflow.
|