stackoverflow April 23, 2026 Rep: 1,294

assembly x86_64 program to uppercase via lut gives strange artefacts


Question Details

No question body available.

Answers (2)

Accepted Answer

April 23, 2026 Score: 2 Rep: 63,539 Quality: High Completeness: 100%

Here are the bugs I noticed:

In LutInit, you write each character using mov [upperlut+r10], r10. In NASM and most other x86 assembly languages, an instruction with both a memory and a register operand infers the operand size from the register size. Since r10 is a 64-bit register, this is a 64-bit store; it stores 8 bytes, where you only wanted 1. So it should be mov [upperlut+r10], r10b, using the 8-bit register r10b which is the low byte of r10. There are two other similar instances in LutInit.

(You can add the byte keyword if you wish, as you have done elsewhere, but it's redundant and does not change the behavior. However, it will give you an assembler error if the register size doesn't match, as in mov byte [upperlut+r10], r10.)

This issue first of all causes you to write 7 bytes beyond the bounds of your lookup table, though that may not cause a visible problem if that memory is mapped and doesn't happen to contain any important data. But it also means that, in particular, when you write the Z (character 90) entry of your lookup table, you write zeros to bytes 91-97, and character 97 is a. So the a entry ends up containing a zero byte, and so all instances of a will be translated to character 0, which displays on most terminals as ^@ as you observed. (Most low ASCII values are displayed as a ^ followed by the character whose code is the byte value plus 64, which gives ^A, ^B, etc. for characters 1, 2, and so on; for character 0, you get character 64, which is @.)

You're missing a ret after the .END label, so after completing LutInit, you fall through into OpenFile, causing an open system call with garbage arguments. (The system call fails, and so you jump to Exit, which should terminate your program before it really does anything, but that doesn't happen because of another bug, below.)

In OpenFile, you have cmp rax, 0 / jle Exit to test for open system call failure. However, 0 is a valid file descriptor, though it's unlikely to be returned by open, as fd 0 should already be open as your standard input. Nonetheless, the correct test is cmp rax, 0 / jl Exit.

Your Exit ends with ret. This doesn't make sense, as you don't want Exit to return. Moreover, all your flow transfers to Exit are via jump, not via call, so there will be no useful return address on the stack anyway. You need to invoke an exit system call here, with rax = 60 and rdi set to the desired return code (0 for successful termination, a small nonzero number for failure).
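To make the fixes above concrete, here is a minimal sketch of how LutInit and Exit might look once corrected. I don't have your full source in front of me, so the surrounding _start code, the local label names, and the 128-byte table size are assumptions; only upperlut, LutInit, and Exit are names taken from the discussion above. It assumes Linux x86-64, NASM syntax, and a static non-PIE link (so absolute addressing still works).

```nasm
; Sketch only. Build (assumed): nasm -f elf64 lut.asm && ld -o lut lut.o
        global  _start

        section .bss
upperlut:       resb 128                    ; assumed size: one entry per ASCII code

        section .text
_start:
        call    LutInit
        movzx   edi, byte [upperlut + 'a']  ; load the entry for 'a'
        sub     edi, 'A'                    ; 0 if 'a' maps to 'A'
        jmp     Exit                        ; exit status = edi

LutInit:
        xor     r10d, r10d                  ; index = 0
.identity:
        mov     [upperlut + r10], r10b      ; 1-byte store: r10b, not r10
        inc     r10d
        cmp     r10d, 128
        jb      .identity
        mov     r10d, 'a'
.lower:
        lea     r11d, [r10 - 32]            ; uppercase = lowercase - 32
        mov     [upperlut + r10], r11b      ; again a 1-byte store
        inc     r10d
        cmp     r10d, 'z'
        jbe     .lower
        ret                                 ; return instead of falling through

Exit:
        mov     eax, 60                     ; exit system call; rdi holds the status
        syscall
```

Running this and then checking the shell's exit status should give 0 when the a entry holds A; with the original 8-byte stores that entry would have been clobbered.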

As hmu535 noted, it would be better to expand your lookup table to 256 bytes, so that every possible byte is correctly handled. This isn't technically needed if you can guarantee that your input file is all pure ASCII (bytes 0-127), but would certainly be essential if your input is not controlled by you.

You probably want to open the output file with O_WRONLY instead of O_RDWR (since you will not be reading it), and with O_TRUNC so that the previous contents are erased if the file already exists. Otherwise, if the output file already exists and is longer than the output of this program, only the first part will be overwritten and the rest of the file will contain its old contents. That would make your flags 01101o (O_WRONLY | O_CREAT | O_TRUNC on Linux) if I'm not mistaken.
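As a rough illustration of those flags, the open call could be sketched like this. The file name OutName, the 644o permission bits, and the local labels are placeholders I made up for the example; the flag values are the Linux x86-64 ones.

```nasm
; Sketch only. Build (assumed): nasm -f elf64 open.asm && ld -o open open.o
        global  _start

O_WRONLY        equ 1o
O_CREAT         equ 100o
O_TRUNC         equ 1000o
SYS_OPEN        equ 2
SYS_EXIT        equ 60

        section .data
OutName:        db "out.txt", 0             ; hypothetical output file name

        section .text
_start:
        mov     eax, SYS_OPEN
        lea     rdi, [rel OutName]          ; path
        mov     esi, O_WRONLY | O_CREAT | O_TRUNC   ; = 1101o
        mov     edx, 644o                   ; mode, used only if the file is created
        syscall
        cmp     rax, 0
        jl      .fail                       ; negative means error; 0 is a valid fd
        xor     edi, edi                    ; status 0: success
        jmp     .exit
.fail:
        mov     edi, 1                      ; status 1: open failed
.exit:
        mov     eax, SYS_EXIT
        syscall
```

Defining the flags with equ keeps them as assemble-time constants, so the OR is folded into a single immediate and nothing is loaded from memory.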

Some other comments and suggestions for improvement:

Consider using RIP-relative addressing throughout instead of absolute. This is how modern Linux x86-64 programs work, and will allow you to build a position-independent executable, so it would be useful to get accustomed to it as soon as possible. Specifying [default rel] at the top of the file will give you a start. You'll also have to load addresses with lea rdi, [FnameR] instead of mov rdi, FnameR. And you won't be able to use static data addresses with register offsets in addressing modes like [upperlut+r10+r11], so you'll need to do a little more address calculation in registers first.

Values that are constants do not need to be allocated in memory, but are better used as immediate operands. So instead of UPPERA dq 65 and mov r11, [UPPERA] (a load from memory), try UPPERA equ 65 and mov r11, UPPERA (an immediate move).

Prefer to use 32-bit instead of 64-bit operations where possible, as they generally have more compact encodings (most 64-bit instructions require a REX prefix, which adds one byte). Take advantage of the fact that all writes to 32-bit registers (e.g. eax) automatically zero the upper 32 bits of the corresponding 64-bit register (e.g. rax); see the Stack Overflow question "Why do x86-64 instructions on 32-bit registers zero the upper part of the full 64-bit register?"

Similarly, prefer to use the low 8 "named" registers eax, ebx, ... where possible, rather than the high 8 "numbered" registers r8, r9, ..., as the high registers also require a REX prefix (even their 32-bit or narrower versions r8d, r8b, etc.).

The most efficient and idiomatic way to zero a register on x86-64 is xor ecx, ecx; see the Stack Overflow question "What is the best way to set a register to zero in x86 assembly: xor, mov or and?" Note this actually zeros the entire 64-bit register as noted above, and has a smaller encoding than xor rcx, rcx when the low 8 "named" registers are used. Your code frequently uses mov reg, 0, which is less efficient.

Additionally, there are several places where you zero a register and then immediately overwrite it with another value (in some cases also zero). This is redundant.

Ideally your read system call would check for errors (negative return value in rax) as well as EOF (zero return), and your write system calls would check for errors (negative return value) and short writes (positive return value but less than the requested number of bytes to write).
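Putting several of these suggestions together, a program in this style might look roughly like the sketch below. To keep it short it filters standard input to standard output rather than opening files, so it is not a drop-in replacement for your code; the buffer size, the label names, and the choice of rbx/r12/r13 as scratch registers are all my own assumptions. It uses default rel, equ constants, 32-bit operations, xor zeroing, a 256-entry table, and checks for read errors, EOF, write errors, and short writes.

```nasm
; Sketch only. Build (assumed): nasm -f elf64 upper.asm && ld -o upper upper.o
        default rel
        global  _start

SYS_READ        equ 0
SYS_WRITE       equ 1
SYS_EXIT        equ 60
BUFSIZE         equ 4096                ; arbitrary buffer size

        section .bss
upperlut:       resb 256
buf:            resb BUFSIZE

        section .text
_start:
        lea     rbx, [upperlut]         ; table base, loaded RIP-relative
        xor     ecx, ecx                ; c = 0
.init:
        mov     eax, ecx                ; default: identity mapping
        lea     edx, [rcx - 'a']
        cmp     edx, 'z' - 'a'
        ja      .store                  ; unsigned compare: c is not in a..z
        sub     eax, 32                 ; lowercase -> uppercase
.store:
        mov     [rbx + rcx], al         ; 1-byte store
        inc     ecx
        cmp     ecx, 256
        jb      .init
.read:
        xor     eax, eax                ; SYS_READ is 0
        xor     edi, edi                ; fd 0 (stdin)
        lea     rsi, [buf]
        mov     edx, BUFSIZE
        syscall
        test    rax, rax
        jz      .ok                     ; 0 bytes: EOF
        js      .fail                   ; negative: read error
        mov     r12, rax                ; bytes still to write
        lea     r13, [buf]              ; current write position
        xor     ecx, ecx                ; syscall clobbers rcx, so reset it
.xlat:
        movzx   eax, byte [r13 + rcx]   ; input byte, zero-extended
        mov     al, [rbx + rax]         ; table lookup
        mov     [r13 + rcx], al
        inc     rcx
        cmp     rcx, r12
        jb      .xlat
.write:
        mov     eax, SYS_WRITE
        mov     edi, 1                  ; fd 1 (stdout)
        mov     rsi, r13
        mov     rdx, r12
        syscall
        test    rax, rax
        jle     .fail                   ; error, or no progress at all
        add     r13, rax                ; skip what was written
        sub     r12, rax
        jnz     .write                  ; short write: send the rest
        jmp     .read
.ok:
        xor     edi, edi                ; status 0
        jmp     .exit
.fail:
        mov     edi, 1                  ; status 1
.exit:
        mov     eax, SYS_EXIT
        syscall
```

Something like ./upper < input.txt > output.txt would exercise it. Note that the table base is kept in rbx because, with RIP-relative addressing, [upperlut + rcx] is no longer encodable.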

April 23, 2026 Score: 3 Rep: 338 Quality: Low Completeness: 70%

Expand LUT:

upperlut resb 256

Initialize correctly (don’t use 128, use full 256):

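LutInit:
    xor r10, r10                        ; r10 = table index, starting at 0
.loop:
    cmp r10, 256
    jge .done                           ; stop after all 256 entries
    mov byte [upperlut + r10], r10b     ; identity mapping, one byte per entry
    inc r10
    jmp .loop
.done: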

Then only override lowercase:

    mov r10, 'a'                        ; first lowercase letter
.loop2:
    cmp r10, 'z'
    jg .end                             ; done once past 'z'
    mov r11, r10
    sub r11, 32                         ; 'a' - 32 = 'A'
    mov byte [upperlut + r10], r11b     ; store the uppercase byte only
    inc r10
    jmp .loop2
.end:
    ret

This removes the out-of-bounds access and fixes both the ^@ output and the truncation issue.
