Why do you think 500 ADU is too high a value? What did you use as offset and gain to measure the bias frame? Both will affect the ADU.
To prevent the signal from the (analog) CMOS sensor from going negative (due primarily to read noise and dark current noise), a small constant DC voltage is added to the sensor's analog output. A gain is then applied to that analog signal before it is finally sent to the analog-to-digital converter (the gain helps reduce the quantization noise of the ADC itself, which is of the order of 1 ADU). This offset is an absolute necessity when you later stack the frames to reduce the variance of the read noise. Without it, the negative noise excursions would be clipped at zero, and the read noise would increase instead of falling with stacking.
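One way to see why the offset matters for stacking is a quick simulation. This is only a sketch under made-up assumptions: Gaussian read noise of 10 ADU RMS, an ADC that simply clips at zero, and a hypothetical helper called stacked_mean. It is not a model of any particular camera.

```python
import random
import statistics

def stacked_mean(offset_adu, read_noise_adu, n_frames, seed=0):
    """Average n_frames of a dark pixel (hypothetical helper, illustrative numbers)."""
    rng = random.Random(seed)
    samples = []
    for _ in range(n_frames):
        value = offset_adu + rng.gauss(0.0, read_noise_adu)
        samples.append(max(0.0, value))  # the ADC cannot report values below zero
    return statistics.fmean(samples)

# No offset: the negative half of the noise is clipped, so the average does not
# settle at the true signal level (zero) no matter how many frames are stacked.
no_offset = stacked_mean(0.0, 10.0, 20000)

# Offset of 5 sigma: clipping is negligible, and after subtracting the offset
# the stacked value settles close to zero.
with_offset = stacked_mean(50.0, 10.0, 20000) - 50.0

print(f"stacked level without offset: {no_offset:.2f} ADU")
print(f"stacked level with offset (offset subtracted): {with_offset:.2f} ADU")
```

With no offset, the stacked value stays well above zero because only the positive half of the noise survives. With the offset, the noise averages out as expected.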
You would expect the output digital signal (in units of ADU) of a sensor with no signal to be around (offset + read noise + dark current noise) * gain. The dark current term is small for a bias frame, which has a very short exposure.
Notice that ZWO uses a gain scale of 0.1 decibel (dB) per gain unit. So a ZWO gain of 200 would correspond to a gain of 20 dB, which corresponds to a voltage gain of 10 (a power gain of 100).
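The conversion above can be sketched in a few lines. The function names are my own, and the 0.1 dB per gain unit scale is the one stated above, so check it against your camera's documentation.

```python
def zwo_gain_to_db(gain_setting):
    """Convert a gain setting to dB, assuming 0.1 dB per gain unit."""
    return gain_setting / 10

def db_to_voltage_gain(db):
    """Voltage (amplitude) ratio for a gain given in dB."""
    return 10 ** (db / 20)

def db_to_power_gain(db):
    """Power ratio for a gain given in dB."""
    return 10 ** (db / 10)

db = zwo_gain_to_db(200)
print(f"{db:.1f} dB, voltage gain {db_to_voltage_gain(db):.1f}, "
      f"power gain {db_to_power_gain(db):.1f}")
```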
That being said, a 500 ADU offset costs you very little. The full scale (clipping) signal produces 65535 ADU in that particular camera. Dropping 500 ADU means that you still have over 65000 ADU steps left -- i.e., virtually 16 bits of dynamic range (a little over 15.98 bits of dynamic range).
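If you want to check the arithmetic yourself, here is a small sketch. It takes "bits of dynamic range" to mean log2 of the number of ADU steps left above the offset, which is my reading of the figure above.

```python
import math

FULL_SCALE_ADU = 65535   # clipping level quoted above for this camera
offset_adu = 500         # bias level under discussion

usable_steps = FULL_SCALE_ADU - offset_adu
bits = math.log2(usable_steps)
bits_lost = 16 - bits

print(f"usable steps: {usable_steps}")
print(f"dynamic range: {bits:.3f} bits (loss of {bits_lost:.3f} bits)")
```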
I would rather chance losing under 0.02 bits of dynamic range than have the read noise drive the ADC input voltage negative. Remember that the read noise in the spec sheets is RMS, not the peak. You would definitely want an offset that is more than 5 sigma of the read noise, where sigma is the RMS value (5 sigma means that on average, 1 pixel out of 3.5 million pixels will still clip the ADC).
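The 1-in-3.5-million figure is the one-sided tail of a Gaussian at 5 sigma. Here is a minimal sketch, assuming the read noise is Gaussian; one_sided_tail is just a helper name I made up.

```python
import math

def one_sided_tail(n_sigma):
    """Probability that a Gaussian sample falls more than n_sigma below its mean."""
    return 0.5 * math.erfc(n_sigma / math.sqrt(2))

for n in (3, 4, 5):
    p = one_sided_tail(n)
    print(f"{n} sigma: about 1 pixel in {1 / p:,.0f} clips")
```

An offset of only 3 sigma would leave roughly one pixel in a few hundred clipping, which is why the larger margin is worth it.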
Chen