Normalizing the mantissa in floating point representation

Question Detail: 

How do I represent $0.148 * 2^{14}$ in normalized floating point arithmetic with the following 16-bit format?

1 bit for the sign, 7 bits for the exponent in Excess-64 form, and 8 bits for the mantissa.

$(0.148)_{10} = (0.00100101\;111...)_2$
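The expansion above can be reproduced by repeated doubling. This is only a sketch: the helper name `fraction_bits` is made up for illustration, and `Fraction` is used so that the decimal input is exact rather than a rounded binary float.

```python
from fractions import Fraction

def fraction_bits(value, count):
    """Return the first `count` binary digits after the point of 0 <= value < 1."""
    x = Fraction(value)      # pass a string such as "0.148" to keep the decimal exact
    bits = []
    for _ in range(count):
        x *= 2
        bit = int(x)         # the integer part is the next binary digit
        bits.append(str(bit))
        x -= bit             # keep only the fractional part
    return "".join(bits)

print(fraction_bits("0.148", 12))  # 001001011110
```

The first eight digits are `00100101` and the next three are `111`, matching the grouping written above.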

Shifting it 3 bits to the left normalizes it: $(0.148)_{10} = (1.00101\;111)_2 * 2^{-3}$, so the full value is $(1.00101\;111)_2 * 2^{14-3} = (1.00101\;111)_2 * 2^{11}$.

Exponent = $11+64 = (75)_{10} = (1001011)_2$ and Mantissa = $(00101\;111)_2$ (the leading $1$ is not stored).

So the floating point representation is $(0\;1001011\;00101111)_2 = (4B2F)_{16}$ (Representation A).
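The packing steps for Representation A can be sketched as follows. The assumptions are mine: a hidden leading 1, a positive input, and extra bits simply truncated; `encode`, `BIAS` and `MANTISSA_BITS` are illustrative names, not part of any real API.

```python
from fractions import Fraction

BIAS = 64           # Excess-64 exponent
MANTISSA_BITS = 8   # stored fraction bits (hidden leading 1 assumed)

def encode(significand, exponent):
    """Pack significand * 2**exponent (positive) as sign | exponent | mantissa."""
    x = Fraction(significand)
    # Normalize to 1 <= x < 2, adjusting the exponent to compensate.
    while x < 1:
        x *= 2
        exponent -= 1
    while x >= 2:
        x /= 2
        exponent += 1
    # Drop the hidden 1 and keep the first 8 fraction bits (truncation).
    mantissa = int((x - 1) * 2**MANTISSA_BITS)
    sign = 0
    return (sign << 15) | ((exponent + BIAS) << MANTISSA_BITS) | mantissa

print(hex(encode("0.148", 14)))  # 0x4b2f
```

Here normalization happens on the exact value, before anything is cut down to 8 bits.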

But if the denormalized mantissa is first stored in an 8-bit register, the last three $1$s are lost, and normalizing $(0.00100101)_2$ then gives $(1.00101\;000)_2$, with three $0$s shifted in instead of the $1$s.

The representation would then be $(0\;1001011\;00101000)_2 = (4B28)_{16}$ (Representation B).
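The difference between A and B comes down to the order of the two operations. A small sketch, assuming truncation in both cases (variable names are illustrative):

```python
from fractions import Fraction

x = Fraction("0.148")

# Order 1: normalize first (shift left 3 bits), then keep 8 fraction bits.
normalize_then_truncate = int((x * 2**3 - 1) * 2**8)

# Order 2: keep only 8 fraction bits of the unnormalized value, then shift left 3.
truncated = Fraction(int(x * 2**8), 2**8)        # 0.00100101 in binary
truncate_then_normalize = int((truncated * 2**3 - 1) * 2**8)

print(format(normalize_then_truncate, "08b"))  # 00101111 (mantissa of A)
print(format(truncate_then_normalize, "08b"))  # 00101000 (mantissa of B)
```

Truncating before normalizing throws away three bits that would otherwise have ended up inside the stored mantissa.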

So while normalizing, does the processor take into account the denormalized mantissa bits beyond the 8th? Or does it just round them off? Which one is correct: A or B?

Does it store the mantissa in fixed point representation? How does it all work?

Asked By : Shashwat
Best Answer from StackOverflow

Question Source : http://cs.stackexchange.com/questions/7828

Answered By : Vor

The correct binary representation of $0.145$ is $0.00100101000111...$

Normalized: $1.00101000111...$

The 8 bit mantissa is $00101000 = 0\mbox{x}28$
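These three lines can be checked with a few lines of exact arithmetic. This is a sketch that assumes a hidden leading 1 and truncation after 8 bits, and it uses the value $0.145$ as written in this answer:

```python
from fractions import Fraction

x = Fraction("0.145")            # the value used in this answer
shift = 0
while x < 1:                     # shift left until the leading bit is 1
    x *= 2
    shift += 1
mantissa = int((x - 1) * 2**8)   # first 8 bits after the hidden leading 1

print(shift, format(mantissa, "08b"), hex(mantissa))  # 3 00101000 0x28
```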
