.TH dis88 1 LOCAL
.SH "NAME"
dis88 \- 8088 symbolic disassembler
.SH "SYNOPSIS"
\fBdis88\fP [ -f -o ] ifile [ ofile ]
.SH "DESCRIPTION"
Dis88 reads ifile, which must be in PC/IX a.out format.
It interprets the binary opcodes and data locations, and
writes corresponding assembler source code to stdout, or
to ofile if specified. The program's output is in the
format of, and fully compatible with, the PC/IX assembler,
as(1). If a symbol table is present in ifile, labels and
references will be symbolic in the output. If the input
file lacks a symbol table, the fact will be noted, and the
disassembly will proceed, with the disassembler generating
synthetic labels as needed. If the input file has split
I/D space, or if it is executable, the disassembler will
make all necessary adjustments in address-reference calculations.
.PP
If the "-o" option appears, object code will be included
in comments during disassembly of the text segment. This
feature is used primarily for debugging the disassembler
itself, but may provide information of passing interest
to users.
.PP
If the "-f" option appears, dis88 will attempt to disassemble
any file whatsoever. It has to assume that the file begins
at address zero.
.PP
The program always outputs the current machine address
before disassembling an opcode. If a symbol table is
present, this address is output as an assembler comment;
otherwise, it is incorporated into the synthetic label
which is generated internally. Since relative jumps,
especially short ones, may target unlabelled locations,
the program always outputs the physical target address
as a comment, to assist the user in following the code.
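.PP
The target address of a short relative jump can be computed as
in the following minimal C sketch. It assumes a two-byte jump
instruction and 16-bit wraparound within the segment; the function
name is hypothetical and is not taken from the dis88 source.
.PP
.nf
```c
#include <assert.h>

/* Target of a short relative jump located at addr: the signed 8-bit
   displacement is relative to the address of the next instruction,
   and a short jump occupies two bytes. Hypothetical helper name. */
unsigned int short_jump_target(unsigned int addr, unsigned int disp)
{
    int d;

    assert(disp <= 0xFF);
    /* Sign-extend the 8-bit displacement by hand. */
    d = disp >= 0x80 ? (int)disp - 0x100 : (int)disp;
    /* Offsets wrap at 64K on the 8088. */
    return (addr + 2 + (unsigned int)d) & 0xFFFF;
}
```
.fi
.PP
For example, a jump at address 0x0100 with displacement 0xFE
targets 0x0100 itself, a location which may well carry no label.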
.PP
The text segment of an object file is always padded to
an even machine address. In addition, if the file has
split I/D space, the text segment will be padded to a
paragraph boundary (i.e., an address divisible by 16).
As a result of this padding, the disassembler may produce
a few spurious, but harmless, instructions at the
end of the text segment.
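.PP
The padding rule above can be illustrated with a short C sketch.
The function names are hypothetical; the sketch shows only the
arithmetic of rounding up to an even address or to a paragraph
boundary, not how the linker or dis88 actually does it.
.PP
.nf
```c
#include <assert.h>

/* Round addr up to the next multiple of boundary (a power of two). */
unsigned long pad_to(unsigned long addr, unsigned long boundary)
{
    assert(boundary != 0 && (boundary & (boundary - 1)) == 0);
    return (addr + boundary - 1) & ~(boundary - 1);
}

/* End of the padded text segment: an even address normally,
   a paragraph (16-byte) boundary when I/D space is split. */
unsigned long padded_text_end(unsigned long text_end, int split_id)
{
    return pad_to(text_end, split_id ? 16UL : 2UL);
}
```
.fi
.PP
A text segment whose code ends at 0x103 is thus padded to 0x104, or
to 0x110 with split I/D space; the bytes in between are the ones
which may appear as spurious instructions.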
.PP
Disassembly of the data segment is a difficult matter.
The information to which initialized data refers cannot
be inferred from context, except in the special case
of an external data or address reference, which will be
reflected in the relocation table. Internal data and
address references will already be resolved in the object file,
and cannot be recreated. Therefore, the data
segment is disassembled as a byte stream, with long
stretches of null data represented by an appropriate
".zerow" pseudo-op. This limitation notwithstanding,
labels (as opposed to symbolic references) are always
output at appropriate points within the data segment.
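.PP
The byte-stream treatment described above might look roughly like
the following C sketch. Everything in it is an assumption made for
illustration: the function names, the threshold at which a run of
zeros counts as "long", the operand of ".zerow" being a word count,
and the exact spelling of the ".byte" operand.
.PP
.nf
```c
#include <assert.h>
#include <stddef.h>
#include <stdio.h>
#include <string.h>

/* Assumed threshold: shorter runs of zeros stay as plain bytes. */
#define MIN_ZERO_WORDS 4

/* Count consecutive all-zero 16-bit words starting at p. */
size_t zero_run_words(const unsigned char *p, size_t nbytes)
{
    size_t words = 0;

    while (nbytes >= 2 && p[0] == 0 && p[1] == 0) {
        words++;
        p += 2;
        nbytes -= 2;
    }
    return words;
}

/* Format the directive for the data at p into line, and return
   the number of input bytes which that directive covers. */
size_t format_data(const unsigned char *p, size_t nbytes,
                   char *line, size_t len)
{
    size_t run;

    assert(nbytes > 0 && len > 0);
    run = zero_run_words(p, nbytes);
    if (run >= MIN_ZERO_WORDS) {
        snprintf(line, len, ".zerow %lu", (unsigned long)run);
        return run * 2;
    }
    snprintf(line, len, ".byte 0x%02x", (unsigned int)p[0]);
    return 1;
}
```
.fi
.PP
A caller would invoke format_data repeatedly, printing each line and
advancing by the returned count until the segment is exhausted.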
.PP
If disassembly of the data segment is difficult, disassembly
of the bss segment is quite easy, because uninitialized data
is all zero by definition. No data is output in the bss
segment, but symbolic labels are output as appropriate.
.PP
For each opcode which takes an operand, a particular
symbol type (text, data, or bss) is appropriate. This
tidy correspondence is complicated somewhat, however,
by the existence of assembler symbolic constants and
segment override opcodes. Therefore, the disassembler's
symbol lookup routine attempts to apply a certain amount
of intelligence when it is asked to find a symbol. If
it cannot match on a symbol of the preferred type, it
may return a symbol of some other type, depending on
preassigned (and somewhat arbitrary) rankings within
each type. Finally, if all else fails, it returns a
string containing the address sought as a hex constant;
this behavior allows calling routines to use the output
of the lookup function regardless of the success of its
search.
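.PP
The fallback behavior just described can be sketched in C as
follows. This is not the dis88 source: the type names, the structure
layout, and the "0x" spelling of the hex constant are assumptions,
and the preassigned rankings are replaced by a simpler rule (the
first symbol of any other type wins).
.PP
.nf
```c
#include <assert.h>
#include <stddef.h>
#include <stdio.h>
#include <string.h>

enum symtype { SYM_TEXT, SYM_DATA, SYM_BSS, SYM_ABS };

struct symbol {
    const char *name;
    unsigned long value;
    enum symtype type;
};

/* Linear search for addr. Prefer a symbol of type want; failing
   that, return the first symbol of another type; failing that,
   format the address as a hex constant in buf. The result is
   always a usable string. */
const char *lookup(const struct symbol *tab, size_t n,
                   unsigned long addr, enum symtype want,
                   char *buf, size_t len)
{
    const struct symbol *fallback = NULL;
    size_t i;

    assert(buf != NULL && len > 0);
    for (i = 0; i < n; i++) {
        if (tab[i].value != addr)
            continue;
        if (tab[i].type == want)
            return tab[i].name;
        if (fallback == NULL)
            fallback = &tab[i];
    }
    if (fallback != NULL)
        return fallback->name;
    snprintf(buf, len, "0x%04lx", addr);
    return buf;
}
```
.fi
.PP
Because the function never returns a null pointer, a caller can pass
its result directly to an output routine without checking whether
the search succeeded.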
.PP
It is worth noting, at this point, that the symbol lookup
routine operates linearly, and has not been optimized in
any way. Execution time is thus likely to grow faster than
linearly with input file size, since both the number of
lookups and the cost of each one increase. The disassembler is
internally limited to 1500 symbol table entries and 1500
relocation table entries; while these limits are generous
(/unix, itself, has fewer than 800 symbols), they are not
guaranteed to be adequate in all cases. If the symbol
table or the relocation table overflows, the disassembly
aborts.
.PP
Finally, users should be aware of a bug in the assembler,
which causes it not to parse the "esc" mnemonic, even
though "esc" is a completely legitimate opcode which is
documented in all the Intel literature. To accommodate
this deficiency, the disassembler translates opcodes of
the "esc" family to .byte directives, but notes the
correct mnemonic in a comment for reference.
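.PP
For reference, opcodes of the "esc" family can be recognized as in
the following C sketch, which relies only on the 8086 instruction
encoding (first byte 0xD8 through 0xDF, followed by a mod-r/m byte).
The function names are hypothetical; how dis88 itself formats the
resulting .byte directives and comments is not shown.
.PP
.nf
```c
#include <assert.h>

/* True if op is the first byte of an esc instruction (11011xxx). */
int is_esc(unsigned int op)
{
    assert(op <= 0xFF);
    return (op & 0xF8) == 0xD8;
}

/* The 6-bit external opcode carried by an esc instruction: the low
   three bits of the first byte, then the middle three bits (the
   "reg" field) of the mod-r/m byte. */
unsigned int esc_extop(unsigned int op, unsigned int modrm)
{
    assert(is_esc(op) && modrm <= 0xFF);
    return ((op & 7) << 3) | ((modrm >> 3) & 7);
}
```
.fi
.PP
The pair 0xD9 0xC0, for instance, is an esc instruction whose
external opcode is 8.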
.PP
In all cases, it should be possible to submit the output
of the disassembler program to the assembler, and assemble
it without error. In most cases, the resulting object
code will be identical to the original; in any event, it
will be functionally equivalent.
.SH "SEE ALSO"
adb(1), as(1), cc(1), ld(1).
.br
"Assembler Reference Manual" in the PC/IX Programmer's
Guide.
.SH "DIAGNOSTICS"
"can't access input file" if the input file cannot be
found, opened, or read.
.sp
"can't open output file" if the output file cannot be
created.
.sp
"warning: host/cpu clash" if the program is run on a
machine with a different CPU.
.sp
"input file not in object format" if the magic number
does not correspond to that of a PC/IX object file.
.sp
"not an 8086/8088 object file" if the CPU ID of the
file header is incorrect.
.sp
"reloc table overflow" if there are more than 1500
entries in the relocation table.
.sp
"symbol table overflow" if there are more than 1500
entries in the symbol table.
.sp
"lseek error" if the input file is corrupted (should
never happen).
.sp
"warning: no symbols" if the symbol table is missing.
.sp
"can't reopen input file" if the input file is removed
or altered during program execution (should never happen).
.SH "BUGS"
Numeric co-processor (i.e., 8087) mnemonics are not currently supported.
Instructions for the co-processor are
disassembled as CPU escape sequences, or as interrupts,
depending on how they were assembled in the first place.
.sp
Despite the program's best efforts, a symbol retrieved
from the symbol table may sometimes be different from
the symbol used in the original assembly.
.sp
The disassembler's internal tables are of fixed size,
and the program aborts if they overflow.