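The previous post, "A First Look at Java Serialization" (初探Java序列化), gave an overview of what serialization and deserialization are and walked through the contents of a serialized file. This time, let's look at how the JDK actually serializes an Object.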
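During serialization, ObjectOutputStream looks (via reflection) for a private writeObject() method on the object's class, and ObjectInputStream looks for a private readObject(); if they exist, they are invoked to run the user-defined serialization and deserialization logic. If the class does not define them, the default field-by-field mechanism is used, which is the same logic exposed as ObjectOutputStream.defaultWriteObject() and ObjectInputStream.defaultReadObject(). Likewise, the most important methods of ObjectOutputStream and ObjectInputStream themselves are also called writeObject() and readObject(); they recursively write out and read in bytes.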
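So a class can customize how it is serialized and deserialized by defining writeObject() and readObject(). Logic that encrypts sensitive fields can also live there. [Note that serialVersionUID is not checked inside these methods; that comparison happens when the class descriptor is read from the stream.]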
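As a rough illustration of the hook methods described above, here is a minimal sketch. The Account class, its fields, and the protect()/reveal() helpers are hypothetical names invented for this example, and Base64 is only a stand-in for a real cipher (it is an encoding, not encryption).

```java
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.ObjectInputStream;
import java.io.ObjectOutputStream;
import java.io.Serializable;
import java.io.UncheckedIOException;
import java.nio.charset.StandardCharsets;
import java.util.Base64;

public class CustomSerializationDemo {

    /** Hypothetical class, for illustration only. */
    public static class Account implements Serializable {
        private static final long serialVersionUID = 1L;

        private final String user;
        // transient: skipped by the default mechanism, written by hand below
        private transient String password;

        public Account(String user, String password) {
            this.user = user;
            this.password = password;
        }

        public String getUser() { return user; }
        public String getPassword() { return password; }

        // Looked up reflectively by ObjectOutputStream; must be private.
        private void writeObject(ObjectOutputStream out) throws IOException {
            out.defaultWriteObject();          // non-transient fields: user
            out.writeUTF(protect(password));   // custom part
        }

        // Looked up reflectively by ObjectInputStream; must be private.
        private void readObject(ObjectInputStream in)
                throws IOException, ClassNotFoundException {
            in.defaultReadObject();            // restores user
            password = reveal(in.readUTF());   // same order as in writeObject
        }

        // Placeholder only: Base64 is an encoding, NOT encryption.
        private static String protect(String s) {
            return Base64.getEncoder()
                    .encodeToString(s.getBytes(StandardCharsets.UTF_8));
        }

        private static String reveal(String s) {
            return new String(Base64.getDecoder().decode(s), StandardCharsets.UTF_8);
        }
    }

    public static byte[] serialize(Object o) {
        ByteArrayOutputStream bos = new ByteArrayOutputStream();
        try (ObjectOutputStream out = new ObjectOutputStream(bos)) {
            out.writeObject(o);
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
        return bos.toByteArray();
    }

    public static Object deserialize(byte[] bytes) {
        try (ObjectInputStream in =
                     new ObjectInputStream(new ByteArrayInputStream(bytes))) {
            return in.readObject();
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        } catch (ClassNotFoundException e) {
            throw new IllegalStateException(e);
        }
    }

    public static void main(String[] args) {
        Account copy = (Account) deserialize(serialize(new Account("alice", "s3cret")));
        System.out.println(copy.getUser() + " / " + copy.getPassword());
    }
}
```

Calling defaultWriteObject() first keeps the ordinary fields on the default path, so only the sensitive field needs hand-written code; readObject() has to consume the data in exactly the order writeObject() produced it.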
For any object, the class information (the class description) is always written first, followed by the field values.
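One way to see this ordering is to serialize a small object and look for where the class name and the field value land in the byte stream. The sketch below does that; the Point class and the helper names are made up for the example, and the field value 0x11223344 is chosen only because it is easy to spot.

```java
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.ObjectOutputStream;
import java.io.Serializable;
import java.io.UncheckedIOException;
import java.nio.charset.StandardCharsets;

public class StreamLayoutDemo {

    /** Hypothetical class, for illustration only. */
    public static class Point implements Serializable {
        private static final long serialVersionUID = 1L;
        int x = 0x11223344;   // distinctive value, easy to find in the bytes
    }

    public static byte[] serialize(Object o) {
        ByteArrayOutputStream bos = new ByteArrayOutputStream();
        try (ObjectOutputStream out = new ObjectOutputStream(bos)) {
            out.writeObject(o);
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
        return bos.toByteArray();
    }

    /** Index of the first occurrence of needle in haystack, or -1. */
    public static int indexOf(byte[] haystack, byte[] needle) {
        outer:
        for (int i = 0; i + needle.length <= haystack.length; i++) {
            for (int j = 0; j < needle.length; j++) {
                if (haystack[i + j] != needle[j]) {
                    continue outer;
                }
            }
            return i;
        }
        return -1;
    }

    /** Where the class name (part of the class description) starts. */
    public static int classNameIndex(byte[] stream) {
        byte[] name = Point.class.getName().getBytes(StandardCharsets.UTF_8);
        return indexOf(stream, name);
    }

    /** Where the four bytes of the int field x start. */
    public static int fieldValueIndex(byte[] stream) {
        return indexOf(stream, new byte[] {0x11, 0x22, 0x33, 0x44});
    }

    public static void main(String[] args) {
        byte[] stream = serialize(new Point());
        System.out.println("class name at " + classNameIndex(stream));
        System.out.println("field value at " + fieldValueIndex(stream));
    }
}
```

The class name offset comes out smaller than the field value offset, which matches the description-then-fields order.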
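Below are defaultWriteObject() and the defaultWriteFields() helper it calls, taken from ObjectOutputStream in JDK 1.8; defaultReadObject() in ObjectInputStream mirrors them on the read side.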
public void defaultWriteObject() throws IOException {
    SerialCallbackContext ctx = curContext;
    if (ctx == null) {
        throw new NotActiveException("not in call to writeObject");
    }
    Object curObj = ctx.getObj();
    ObjectStreamClass curDesc = ctx.getDesc();
    bout.setBlockDataMode(false);
    defaultWriteFields(curObj, curDesc);
    bout.setBlockDataMode(true);
}

private void defaultWriteFields(Object obj, ObjectStreamClass desc)
        throws IOException {
    Class<?> cl = desc.forClass();
    if (cl != null && obj != null && !cl.isInstance(obj))
        throw new ClassCastException();

    desc.checkDefaultSerialize();

    // 1. primitive fields: packed into one byte[] and written in one go
    int primDataSize = desc.getPrimDataSize();
    if (primVals == null || primVals.length < primDataSize)
        primVals = new byte[primDataSize];
    desc.getPrimFieldValues(obj, primVals);
    bout.write(primVals, 0, primDataSize, false);

    // 2. object fields: each one goes back through writeObject0()
    ObjectStreamField[] fields = desc.getFields(false);
    Object[] objVals = new Object[desc.getNumObjFields()];
    int numPrimFields = fields.length - objVals.length;
    desc.getObjFieldValues(obj, objVals);
    for (int i = 0; i < objVals.length; i++) {
        if (extendedDebugInfo)
            debugInfoStack.push(
                "field (class \"" + desc.getName() + "\", name: \"" +
                fields[numPrimFields + i].getName() + "\", type: \"" +
                fields[numPrimFields + i].getType() + "\")");
        writeObject0(objVals[i], fields[numPrimFields + i].isUnshared());
    }
}
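As the code above shows, when Java writes an object it splits the fields by type into two groups: primitives and objects.
Primitive fields: getPrimFieldValues()
Java records the memory offset of each field of the object in a FieldReflector, then uses Unsafe to read the primitive values at those offsets and packs them into the byte[] that is eventually written out.
[All of this happens in ObjectStreamClass.]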
// In ObjectStreamClass:
fieldRefl = getReflector(fields, this);

FieldReflector(ObjectStreamField[] fields) {
    this.fields = fields;
    int nfields = fields.length;
    readKeys = new long[nfields];
    writeKeys = new long[nfields];
    offsets = new int[nfields];
    typeCodes = new char[nfields];
    ArrayList<Class<?>> typeList = new ArrayList<>();
    Set<Long> usedKeys = new HashSet<>();

    for (int i = 0; i < nfields; i++) {
        ObjectStreamField f = fields[i];
        Field rf = f.getField();
        // memory offset of the field inside the object
        long key = (rf != null) ?
            unsafe.objectFieldOffset(rf) : Unsafe.INVALID_FIELD_OFFSET;
        readKeys[i] = key;
        writeKeys[i] = usedKeys.add(key) ?
            key : Unsafe.INVALID_FIELD_OFFSET;
        offsets[i] = f.getOffset();
        typeCodes[i] = f.getTypeCode();
        if (!f.isPrimitive())
            typeList.add((rf != null) ? rf.getType() : null);
    }

    types = typeList.toArray(new Class<?>[typeList.size()]);
    numPrimFields = nfields - types.length;
}

void getPrimFieldValues(Object obj, byte[] buf) {
    if (obj == null)
        throw new NullPointerException();
    for (int i = 0; i < numPrimFields; i++) {
        long key = readKeys[i];
        int off = offsets[i];
        switch (typeCodes[i]) {
            case 'Z': Bits.putBoolean(buf, off, unsafe.getBoolean(obj, key)); break;
            case 'B': buf[off] = unsafe.getByte(obj, key); break;
            case 'C': Bits.putChar(buf, off, unsafe.getChar(obj, key)); break;
            case 'S': Bits.putShort(buf, off, unsafe.getShort(obj, key)); break;
            case 'I': Bits.putInt(buf, off, unsafe.getInt(obj, key)); break;
            case 'F': Bits.putFloat(buf, off, unsafe.getFloat(obj, key)); break;
            case 'J': Bits.putLong(buf, off, unsafe.getLong(obj, key)); break;
            case 'D': Bits.putDouble(buf, off, unsafe.getDouble(obj, key)); break;
            default: throw new InternalError();
        }
    }
}
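[Conversions between primitive types and bytes are all handled by the Bits class; if you are curious, it is a good place to study how Java maps primitives to bytes and back.]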
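java.io.Bits is package-private, so it cannot be called from application code. The following is a minimal sketch of the same idea for int, assuming big-endian (most significant byte first) order, which is what the serialization stream uses; BitsSketch and its method names are invented for this example and are not the JDK's implementation.

```java
public class BitsSketch {

    /** Writes val into b[off..off+3], most significant byte first. */
    public static void putInt(byte[] b, int off, int val) {
        b[off]     = (byte) (val >>> 24);
        b[off + 1] = (byte) (val >>> 16);
        b[off + 2] = (byte) (val >>> 8);
        b[off + 3] = (byte) val;
    }

    /** Reads a big-endian int back from b[off..off+3]. */
    public static int getInt(byte[] b, int off) {
        // & 0xFF undoes sign extension when a byte is widened to int
        return ((b[off]     & 0xFF) << 24)
             | ((b[off + 1] & 0xFF) << 16)
             | ((b[off + 2] & 0xFF) << 8)
             |  (b[off + 3] & 0xFF);
    }

    public static void main(String[] args) {
        byte[] buf = new byte[4];
        putInt(buf, 0, 0x11223344);
        System.out.println(Integer.toHexString(getInt(buf, 0)));  // prints 11223344
    }
}
```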
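Also, float and double are not written directly: a float is first converted to its int bit pattern and a double to its long bit pattern, and only then are they turned into bytes.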
// java.lang.Float
public static native int floatToRawIntBits(float value);
// java.lang.Double
public static native long doubleToRawLongBits(double value);
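To make the two-step conversion concrete, here is a small sketch that goes from a floating-point value to its bit pattern and then to bytes. It uses java.nio.ByteBuffer (big-endian by default) for the second step instead of the internal Bits class; FloatBitsDemo and its method names are assumptions made for the example.

```java
import java.nio.ByteBuffer;

public class FloatBitsDemo {

    /** float -> int bit pattern -> 4 big-endian bytes. */
    public static byte[] floatToBytes(float f) {
        int bits = Float.floatToRawIntBits(f);
        return ByteBuffer.allocate(4).putInt(bits).array();
    }

    /** double -> long bit pattern -> 8 big-endian bytes. */
    public static byte[] doubleToBytes(double d) {
        long bits = Double.doubleToRawLongBits(d);
        return ByteBuffer.allocate(8).putLong(bits).array();
    }

    /** The reverse direction for float: bytes -> int bits -> float. */
    public static float bytesToFloat(byte[] b) {
        return Float.intBitsToFloat(ByteBuffer.wrap(b).getInt());
    }

    public static void main(String[] args) {
        // IEEE 754 bit patterns of 1.0
        System.out.println(Integer.toHexString(Float.floatToRawIntBits(1.0f)));   // prints 3f800000
        System.out.println(Long.toHexString(Double.doubleToRawLongBits(1.0)));    // prints 3ff0000000000000
    }
}
```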
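Object fields: getObjFieldValues()
Writing an object is more involved than writing a primitive. The JDK first checks whether the class already has a serialization record, in which case the ClassDesc can be reused directly; it also has to work out what kind of object it is dealing with: null, handle, Class, array, String, Enum, and so on. Finally, writeOrdinaryObject() calls writeSerialData() and defaultWriteFields(), which recurse until only primitive values are left to write.
For details, see ObjectOutputStream.writeObject0(), shown below.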
private void writeObject0(Object obj, boolean unshared)
        throws IOException {
    boolean oldMode = bout.setBlockDataMode(false);
    depth++;
    try {
        // (abridged: the JDK source first handles null, previously written
        // handles, Class and ObjectStreamClass here)
        int h;

        // check for replacement object
        Object orig = obj;
        Class<?> cl = obj.getClass();
        ObjectStreamClass desc;
        for (;;) {
            Class<?> repCl;
            desc = ObjectStreamClass.lookup(cl, true);
            if (!desc.hasWriteReplaceMethod() ||
                (obj = desc.invokeWriteReplace(obj)) == null ||
                (repCl = obj.getClass()) == cl)
                break;
            cl = repCl;
        }
        if (enableReplace) {
            Object rep = replaceObject(obj);
            if (rep != obj && rep != null) {
                cl = rep.getClass();
                desc = ObjectStreamClass.lookup(cl, true);
            }
            obj = rep;
        }

        // if object replaced, run through original checks a second time
        if (obj != orig) {
            subs.assign(orig, obj);
            if (obj == null) {
                writeNull();
                return;
            } else if (!unshared && (h = handles.lookup(obj)) != -1) {
                writeHandle(h);
                return;
            } else if (obj instanceof Class) {
                writeClass((Class) obj, unshared);
                return;
            } else if (obj instanceof ObjectStreamClass) {
                writeClassDesc((ObjectStreamClass) obj, unshared);
                return;
            }
        }

        // remaining cases
        if (obj instanceof String) {
            writeString((String) obj, unshared);
        } else if (cl.isArray()) {
            writeArray(obj, desc, unshared);
        } else if (obj instanceof Enum) {
            writeEnum((Enum<?>) obj, desc, unshared);
        } else if (obj instanceof Serializable) {
            writeOrdinaryObject(obj, desc, unshared);
        } else
            throw new NotSerializableException(cl.getName());
    } finally {
        depth--;
        bout.setBlockDataMode(oldMode);
    }
}
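Two of the branches in writeObject0() are easy to observe from outside: the handle lookup (an object that was already written is replaced by a short back-reference unless it is written unshared) and the final NotSerializableException. The sketch below exercises both through the public API; HandleDemo and its method names are made up for the example.

```java
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.NotSerializableException;
import java.io.ObjectOutputStream;
import java.io.UncheckedIOException;

public class HandleDemo {

    /** Writes the same object twice and returns the total stream size in bytes. */
    public static int sizeWritingTwice(Object o, boolean unshared) {
        ByteArrayOutputStream bos = new ByteArrayOutputStream();
        try (ObjectOutputStream out = new ObjectOutputStream(bos)) {
            if (unshared) {
                out.writeUnshared(o);   // never reuses a handle
                out.writeUnshared(o);
            } else {
                out.writeObject(o);     // full write
                out.writeObject(o);     // back-reference to the first write
            }
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
        return bos.size();
    }

    /** True if the stream refuses the object with NotSerializableException. */
    public static boolean isRejected(Object o) {
        try (ObjectOutputStream out =
                     new ObjectOutputStream(new ByteArrayOutputStream())) {
            out.writeObject(o);
            return false;
        } catch (NotSerializableException e) {
            return true;
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    public static void main(String[] args) {
        System.out.println("shared:   " + sizeWritingTwice("hello", false));
        System.out.println("unshared: " + sizeWritingTwice("hello", true));
        System.out.println("plain Object rejected: " + isRejected(new Object()));
    }
}
```

The shared variant is smaller because the second write is only a reference tag plus a 4-byte handle instead of the whole String again.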
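I also looked at how a typical type, String, is serialized. The first byte is a type tag that distinguishes a normal String (TC_STRING) from a long String (TC_LONGSTRING); the length comes next (2 bytes for a normal String, 8 bytes for a long one), and only then is the content written, in modified UTF-8.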
private void writeString(String str, boolean unshared) throws IOException {
    handles.assign(unshared ? null : str);
    long utflen = bout.getUTFLength(str);
    if (utflen <= 0xFFFF) {
        bout.writeByte(TC_STRING);
        bout.writeUTF(str, utflen);
    } else {
        bout.writeByte(TC_LONGSTRING);
        bout.writeLongUTF(str, utflen);
    }
}

// In BlockDataOutputStream:
void writeUTF(String s, long utflen) throws IOException {
    if (utflen > 0xFFFFL)
        throw new UTFDataFormatException();
    writeShort((int) utflen);
    // fast path only when every char encodes to a single byte
    if (utflen == (long) s.length()) {
        writeBytes(s);
    } else {
        writeUTFBody(s);
    }
}
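The tag, length, content layout can be checked by serializing a short String and dumping the bytes. In the sketch below, StringEncodingDemo and the hex() helper are names invented for the example; the first four bytes of the output are the stream header (magic number and version) that ObjectOutputStream always writes.

```java
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.ObjectOutputStream;
import java.io.UncheckedIOException;

public class StringEncodingDemo {

    public static byte[] serialize(String s) {
        ByteArrayOutputStream bos = new ByteArrayOutputStream();
        try (ObjectOutputStream out = new ObjectOutputStream(bos)) {
            out.writeObject(s);
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
        return bos.toByteArray();
    }

    /** Upper-case hex dump without separators. */
    public static String hex(byte[] bytes) {
        StringBuilder sb = new StringBuilder(bytes.length * 2);
        for (byte b : bytes) {
            sb.append(String.format("%02X", b & 0xFF));
        }
        return sb.toString();
    }

    public static void main(String[] args) {
        // ACED 0005 = stream header, 74 = TC_STRING, 0003 = length, 616263 = "abc"
        System.out.println(hex(serialize("abc")));  // prints ACED0005740003616263
    }
}
```

A String whose encoded length exceeds 0xFFFF bytes starts with 0x7C (TC_LONGSTRING) instead of 0x74 and carries an 8-byte length.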