java.util
),可以显着提高性能并减少java.util.UUID
的无用分配。在让我自己接受JDK维护者的判断之前,请帮助我发现错误和愚蠢! :-) benchmark instances Bytes ns linear runtime
JdkUuidFromString 51.00 1544.0 608.2 ==============================
NessUuidFromString 2.00 72.0 179.1 ========
JdkUuidToString 31.00 1720.0 321.4 ===============
NessUuidToString 3.00 200.0 51.5 ==
FromString获得3倍的加速和对象分配的1/25。 ToString的速度提高了6倍,对象分配提高了1/10。
代码如下:
/**
* A class that provides an alternate implementation of {@link
* UUID#fromString(String)} and {@link UUID#toString()}.
*
* <p> The version in the JDK uses {@link String#split(String)}
* which does not compile the regular expression that is used for splitting
* the UUID string and results in the allocation of multiple strings in a
* string array. We decided to write {@link NessUUID} when we ran into
* performance issues with the garbage produced by the JDK class.
*
*/
public class NessUUID {
private NessUUID() {}
private static final int NUM_ALPHA_DIFF = 'A' - '9' - 1;
private static final int LOWER_UPPER_DIFF = 'a' - 'A';
// FROM STRING
public static UUID fromString(String str) {
int dashCount = 4;
int [] dashPos = new int [6];
dashPos[0] = -1;
dashPos[5] = str.length();
for (int i = str.length()-1; i >= 0; i--) {
if (str.charAt(i) == '-') {
if (dashCount == 0) {
throw new IllegalArgumentException("Invalid UUID string: " + str);
}
dashPos[dashCount--] = i;
}
}
if (dashCount > 0) {
throw new IllegalArgumentException("Invalid UUID string: " + str);
}
long mostSigBits = decode(str, dashPos, 0) & 0xffffffffL;
mostSigBits <<= 16;
mostSigBits |= (decode(str, dashPos, 1) & 0xffffL);
mostSigBits <<= 16;
mostSigBits |= (decode(str, dashPos, 2) & 0xffffL);
long leastSigBits = (decode(str, dashPos, 3) & 0xffffL);
leastSigBits <<= 48;
leastSigBits |= (decode(str, dashPos, 4) & 0xffffffffffffL);
return new UUID(mostSigBits, leastSigBits);
}
private static long decode(final String str, final int [] dashPos, final int field) {
int start = dashPos[field]+1;
int end = dashPos[field+1];
if (start >= end) {
throw new IllegalArgumentException("Invalid UUID string: " + str);
}
long curr = 0;
for (int i = start; i < end; i++) {
int x = getNibbleFromChar(str.charAt(i));
curr <<= 4;
if (curr < 0) {
throw new NumberFormatException("long overflow");
}
curr |= x;
}
return curr;
}
static int getNibbleFromChar(final char c)
{
int x = c - '0';
if (x > 9) {
x -= NUM_ALPHA_DIFF; // difference between '9' and 'A'
if (x > 15) {
x -= LOWER_UPPER_DIFF; // difference between 'a' and 'A'
}
if (x < 10) {
throw new IllegalArgumentException(c + " is not a valid character for an UUID string");
}
}
if (x < 0 || x > 15) {
throw new IllegalArgumentException(c + " is not a valid character for an UUID string");
}
return x;
}
// TO STRING
public static String toString(UUID uuid)
{
return toString(uuid.getMostSignificantBits(), uuid.getLeastSignificantBits());
}
/** Roughly patterned (read: stolen) from java.util.UUID and java.lang.Long. */
public static String toString(long msb, long lsb)
{
char[] uuidChars = new char[36];
digits(uuidChars, 0, 8, msb >> 32);
uuidChars[8] = '-';
digits(uuidChars, 9, 4, msb >> 16);
uuidChars[13] = '-';
digits(uuidChars, 14, 4, msb);
uuidChars[18] = '-';
digits(uuidChars, 19, 4, lsb >> 48);
uuidChars[23] = '-';
digits(uuidChars, 24, 12, lsb);
return new String(uuidChars);
}
private static void digits(char[] dest, int offset, int digits, long val) {
long hi = 1L << (digits * 4);
toUnsignedString(dest, offset, digits, hi | (val & (hi - 1)), 4);
}
private final static char[] DIGITS = {
'0' , '1' , '2' , '3' , '4' , '5' ,
'6' , '7' , '8' , '9' , 'a' , 'b' ,
'c' , 'd' , 'e' , 'f' , 'g' , 'h' ,
'i' , 'j' , 'k' , 'l' , 'm' , 'n' ,
'o' , 'p' , 'q' , 'r' , 's' , 't' ,
'u' , 'v' , 'w' , 'x' , 'y' , 'z'
};
private static void toUnsignedString(char[] dest, int offset, int len, long i, int shift) {
int charPos = len;
int radix = 1 << shift;
long mask = radix - 1;
do {
dest[offset + --charPos] = DIGITS[(int)(i & mask)];
i >>>= shift;
} while (i != 0 && charPos > 0);
}
}
#1 楼
我无法使Caliper运行,但破解了自己的测试:我的最初结果是:
warmup
JdkFrom: 1787.38 JdkTo: 635.12 NessFrom: 460.15 NessTo: 183.67 [-4552303853801426784, 69220000, -4552303853801426784, 69220000]
Real Run
JdkFrom: 1415.68 JdkTo: 553.28 NessFrom: 426.29 NessTo: 94.69 [-4552303853801426784, 69220000, -4552303853801426784, 69220000]
JdkFrom: 1394.24 JdkTo: 387.14 NessFrom: 340.78 NessTo: 59.33 [-4552303853801426784, 69220000, -4552303853801426784, 69220000]
JdkFrom: 1378.38 JdkTo: 339.20 NessFrom: 325.73 NessTo: 59.20 [-4552303853801426784, 69220000, -4552303853801426784, 69220000]
JdkFrom: 1381.61 JdkTo: 334.28 NessFrom: 389.30 NessTo: 59.09 [-4552303853801426784, 69220000, -4552303853801426784, 69220000]
所以,从表面上看,是的,您的算法就快得多了。
对于代码审查,我有一些评论:
fromString()
我不喜欢这样,您会忽略UUID的必需格式,实际上,您说它有4个破折号就很酷,但是,实际上,破折号之间的位数非常大,而您忽略了它。
我觉得您应该在验证和计算破折号的同时计算长位。此后重复循环似乎是多余的。
如果您正在寻找原始性能,我发现的一个窍门是查找表有很大的不同...我将举一个例子。
toString()
我不喜欢公共
toString(long,long)
方法。这不是“对称的”。仅toString(UUID)应该是公开的。DIGITS
代码似乎旨在满足许多不同的基数(基数是复数)。这使这种特殊情况显得有些笨重。方法调用的层次太多。可能要浅得多。
考虑:
我对此有一个破解,因此决定可以做得更好....考虑以下结果:
warmup
JdkFrom: 1929.14 JdkTo: 542.10 NessFrom: 270.43 NessTo: 175.71 [2254274162472357232, 70459000, 2254274162472357232, 70459000]
Real Run
JdkFrom: 1569.85 JdkTo: 404.93 NessFrom: 249.37 NessTo: 45.94 [2254274162472357232, 70459000, 2254274162472357232, 70459000]
JdkFrom: 1528.79 JdkTo: 279.55 NessFrom: 114.74 NessTo: 44.71 [2254274162472357232, 70459000, 2254274162472357232, 70459000]
JdkFrom: 1657.85 JdkTo: 271.24 NessFrom: 118.20 NessTo: 44.43 [2254274162472357232, 70459000, 2254274162472357232, 70459000]
JdkFrom: 1563.52 JdkTo: 273.69 NessFrom: 140.96 NessTo: 46.46 [2254274162472357232, 70459000, 2254274162472357232, 70459000]
这比fromString的版本快三倍,比toString的速度快0.2倍。
这里是(以我的经验)与Java一样快的代码:
package uuid;
import java.util.Arrays;
import java.util.UUID;
/**
* A class that provides an alternate implementation of {@link
* UUID#fromString(String)} and {@link UUID#toString()}.
*
* <p> The version in the JDK uses {@link String#split(String)}
* which does not compile the regular expression that is used for splitting
* the UUID string and results in the allocation of multiple strings in a
* string array. We decided to write {@link NessUUID} when we ran into
* performance issues with the garbage produced by the JDK class.
*
*/
public class NessUUID {
private NessUUID() {}
// lookup is an array indexed by the **char**, and it has
// valid values set with the decimal value of the hex char.
private static final long[] lookup = buildLookup();
private static final int DASH = -1;
private static final int ERROR = -2;
private static final long[] buildLookup() {
long [] lu = new long[128];
Arrays.fill(lu, ERROR);
lu['0'] = 0;
lu['1'] = 1;
lu['2'] = 2;
lu['3'] = 3;
lu['4'] = 4;
lu['5'] = 5;
lu['6'] = 6;
lu['7'] = 7;
lu['8'] = 8;
lu['9'] = 9;
lu['a'] = 10;
lu['b'] = 11;
lu['c'] = 12;
lu['d'] = 13;
lu['e'] = 14;
lu['f'] = 15;
lu['A'] = 10;
lu['B'] = 11;
lu['C'] = 12;
lu['D'] = 13;
lu['E'] = 14;
lu['F'] = 15;
lu['-'] = DASH;
return lu;
}
// FROM STRING
public static UUID fromString(final String str) {
final int len = str.length();
if (len != 36) {
throw new IllegalArgumentException("Invalid UUID string (expected to be 36 characters long)");
}
final long[] vals = new long[2];
int shift = 60;
int index = 0;
for (int i = 0; i < len; i++) {
final int c = str.charAt(i);
if (c >= lookup.length || lookup[c] == ERROR) {
throw new IllegalArgumentException("Invalid UUID string (unexpected '" + str.charAt(i) + "' at position " + i + " -> " + str + " )");
}
if (lookup[c] == DASH) {
if ((i - 8) % 5 != 0) {
throw new IllegalArgumentException("Invalid UUID string (unexpected '-' at position " + i + " -> " + str + " )");
}
continue;
}
vals[index] |= lookup[c] << shift;
shift -= 4;
if (shift < 0) {
shift = 60;
index++;
}
}
return new UUID(vals[0], vals[1]);
}
// TO STRING
// recode is 2-byte arrays representing the hex representation of every byte value (all 256)
private static final char[][] recode = buildByteBlocks();
private static char[][] buildByteBlocks() {
final char[][] ret = new char[256][];
for (int i = 0; i < ret.length; i++) {
ret[i] = String.format("%02x", i).toCharArray();
}
return ret;
}
public static final String toString(final UUID uuid) {
long msb = uuid.getMostSignificantBits();
long lsb = uuid.getLeastSignificantBits();
char[] uuidChars = new char[36];
int cursor = uuidChars.length;
while (cursor > 24 ) {
cursor -= 2;
System.arraycopy(recode[(int)(lsb & 0xff)], 0, uuidChars, cursor, 2);
lsb >>>= 8;
}
uuidChars[--cursor] = '-';
while (cursor > 19) {
cursor -= 2;
System.arraycopy(recode[(int)(lsb & 0xff)], 0, uuidChars, cursor, 2);
lsb >>>= 8;
}
uuidChars[--cursor] = '-';
while (cursor > 14) {
cursor -= 2;
System.arraycopy(recode[(int)(msb & 0xff)], 0, uuidChars, cursor, 2);
msb >>>= 8;
}
uuidChars[--cursor] = '-';
while (cursor > 9) {
cursor -= 2;
System.arraycopy(recode[(int)(msb & 0xff)], 0, uuidChars, cursor, 2);
msb >>>= 8;
}
uuidChars[--cursor] = '-';
while (cursor > 0) {
cursor -= 2;
System.arraycopy(recode[(int)(msb & 0xff)], 0, uuidChars, cursor, 2);
msb >>>= 8;
}
return new String(uuidChars);
}
}
为了娱乐,这里是我的测试类(我不知道Caliper) :
package uuid;
import java.util.Arrays;
import java.util.UUID;
public class PerformanceComparison
{
private final int N_UUIDS = 1000;
private final UUID[] testUuids = new UUID[N_UUIDS];
private final String[] testStrings = new String[N_UUIDS];
public void setup () {
for (int i = 0; i < N_UUIDS; i++)
{
testUuids[i] = UUID.randomUUID();
testStrings[i] = testUuids[i].toString();
}
}
public static void main(String[] args) {
PerformanceComparison pc = new PerformanceComparison();
final UUID uuidj = UUID.randomUUID();
String valj = uuidj.toString();
String valn = NessUUID.toString(uuidj);
UUID uuidn = NessUUID.fromString(valn);
if (!valj.equals(valn)) {
throw new IllegalStateException("Illegal conversion");
}
if (!uuidj.equals(uuidn)) {
throw new IllegalStateException("Illegal conversion");
}
pc.setup();
final int reps = 1000000;
System.out.println(" warmup");
pc.runAll(reps);
System.out.println(" Real Run");
pc.runAll(reps);
pc.runAll(reps);
pc.runAll(reps);
pc.runAll(reps);
}
private final void runAll(final int reps) {
long[] accum = new long[4];
System.out.printf(" JdkFrom: %6.2f JdkTo: %6.2f NessFrom: %6.2f NessTo: %6.2f %s\n",
timeJdkUuidFromString(reps, accum, 0) / 1000000.0,
timeJdkUuidToString(reps, accum, 1) / 1000000.0,
timeNessUuidFromString(reps, accum, 2) / 1000000.0,
timeNessUuidToString(reps, accum, 3) / 1000000.0,
Arrays.toString(accum));
}
public long timeJdkUuidFromString(int reps, long[] accum2, int j)
{
long accum = 0;
long start = System.nanoTime();
for (int i = 0; i < reps; i++)
{
accum += UUID.fromString(testStrings[i % N_UUIDS]).getMostSignificantBits();
}
accum2[j] = accum;
return System.nanoTime() - start;
}
public long timeJdkUuidToString(int reps, long[] accum2, int j)
{
long accum = 0;
long start = System.nanoTime();
for (int i = 0; i < reps; i++)
{
accum += testUuids[i % N_UUIDS].toString().charAt(0);
}
accum2[j] = accum;
return System.nanoTime() - start;
}
public long timeNessUuidFromString(int reps, long[] accum2, int j)
{
long accum = 0;
long start = System.nanoTime();
for (int i = 0; i < reps; i++)
{
accum += NessUUID.fromString(testStrings[i % N_UUIDS]).getMostSignificantBits();
}
accum2[j] = accum;
return System.nanoTime() - start;
}
public long timeNessUuidToString(int reps, long[] accum2, int j)
{
long accum = 0;
long start = System.nanoTime();
for (int i = 0; i < reps; i++)
{
accum += NessUUID.toString(testUuids[i % N_UUIDS]).charAt(0);
}
accum2[j] = accum;
return System.nanoTime() - start;
}
}
评论
在Java 7中,String.split()对不包含元字符的单个字符模式进行了“快速路径”改进。在这种情况下,在“-”上分割将采用快速路径。此快速路径避免使用正则表达式引擎,并提高了琐碎文件的性能。不过,很高兴看到这些改进。toString(long msb,long lsb)应该是私有的。