这可能并不完全是主题,但是今天天气很慢。

是否有一种更有效的方法来获取1到49的数字列表,并且该列包含单词FIZZ可以均匀除以3,BUZZ可以除以5时,FIZZBUZZ可以除以3和5时?

我的尝试是(注意,这将清空过程高速缓存,因此不要在生产盒上运行):

DBCC FREEPROCCACHE
GO
/*VARIANT1*/
;WITH t AS (
    SELECT RowNum = ROW_NUMBER() OVER (ORDER BY o.object_id)
    FROM sys.objects o
)
SELECT t.RowNum
    , CASE WHEN ((t.RowNum % 3) + (t.RowNum % 5)) = 0  THEN 'FIZZBUZZ' 
    ELSE 
        CASE WHEN t.RowNum % 3 = 0 THEN 'FIZZ' 
        ELSE 
            CASE WHEN t.RowNum % 5 = 0 THEN 'BUZZ' 
            ELSE '' 
            END 
        END 
    END
FROM t
WHERE t.RowNum < 50;
GO 100

/*VARIANT2*/
DECLARE @t TABLE
(
    Num INT NOT NULL PRIMARY KEY CLUSTERED
);
INSERT INTO @t (Num)
SELECT ROW_NUMBER() OVER (ORDER BY o.object_id)
FROM sys.objects o;

SELECT t.Num
    , CASE WHEN ((t.Num % 3) + (t.Num % 5)) = 0  THEN 'FIZZBUZZ' 
    ELSE 
        CASE WHEN t.Num % 3 = 0 THEN 'FIZZ' 
        ELSE 
            CASE WHEN t.Num % 5 = 0 THEN 'BUZZ' 
            ELSE '' 
            END 
        END 
    END
FROM @t t
WHERE t.Num < 50;
GO 100

SELECT CASE WHEN dest.text LIKE '%/*VARIANT1*/%' THEN 'VARIANT1' ELSE 'VARIANT2' END
    , MAX(deqs.execution_count)
    , SUM(deqs.total_worker_time)
    , AvgWorkerTime = SUM(deqs.total_worker_time) / MAX(deqs.execution_count)
FROM sys.dm_exec_query_stats deqs
CROSS APPLY sys.dm_exec_sql_text(deqs.sql_handle) dest
WHERE (dest.text LIKE '%/*VARIANT1*/%'
    OR dest.text LIKE '%/*VARIANT2*/%')
    AND dest.text NOT LIKE '%/*NOT_ME!*/%'
GROUP BY CASE WHEN dest.text LIKE '%/*VARIANT1*/%' THEN 'VARIANT1' ELSE 'VARIANT2' END
ORDER BY CASE WHEN dest.text LIKE '%/*VARIANT1*/%' THEN 'VARIANT1' ELSE 'VARIANT2' END
/*NOT_ME!*/;


在@AaronBertrand的建议下,我已经修改了对每组语句运行100次的尝试,然后通过sys.dm_exec_query_stats显示了SQL Server记录的时间。

结果:

            Runs    total_time      average time
VARIANT1    100     42533           425
VARIANT2    100     138677          1386


#1 楼

下面是一个T-SQL解决方案,它将前一百万个数字写入临时表。我的机器大约需要84毫秒。关键瓶颈正在等待NESTING_TRANSACTION_FULL闩锁和CXPACKET,除了更改MAXDOP之外,我都不知道该如何解决。我想要一个可以利用并行嵌套循环和基于需求的并行性的查询计划,这是我设法得到的:



代码有点长。简而言之,我将246行和271行的两个派生表连接在一起,总共66666行。这些行连接到15行派生表,这利用了FIZZBUZZ模式每15行重复一次这一事实。最后十行添加一个UNION ALL

DROP TABLE IF EXISTS #t;

SELECT res.fizzbuzz INTO #t
FROM
(
VALUES
(0),
(15),
(30),
(45),
(60),
(75),
(90),
(105),
(120),
(135),
(150),
(165),
(180),
(195),
(210),
(225),
(240),
(255),
(270),
(285),
(300),
(315),
(330),
(345),
(360),
(375),
(390),
(405),
(420),
(435),
(450),
(465),
(480),
(495),
(510),
(525),
(540),
(555),
(570),
(585),
(600),
(615),
(630),
(645),
(660),
(675),
(690),
(705),
(720),
(735),
(750),
(765),
(780),
(795),
(810),
(825),
(840),
(855),
(870),
(885),
(900),
(915),
(930),
(945),
(960),
(975),
(990),
(1005),
(1020),
(1035),
(1050),
(1065),
(1080),
(1095),
(1110),
(1125),
(1140),
(1155),
(1170),
(1185),
(1200),
(1215),
(1230),
(1245),
(1260),
(1275),
(1290),
(1305),
(1320),
(1335),
(1350),
(1365),
(1380),
(1395),
(1410),
(1425),
(1440),
(1455),
(1470),
(1485),
(1500),
(1515),
(1530),
(1545),
(1560),
(1575),
(1590),
(1605),
(1620),
(1635),
(1650),
(1665),
(1680),
(1695),
(1710),
(1725),
(1740),
(1755),
(1770),
(1785),
(1800),
(1815),
(1830),
(1845),
(1860),
(1875),
(1890),
(1905),
(1920),
(1935),
(1950),
(1965),
(1980),
(1995),
(2010),
(2025),
(2040),
(2055),
(2070),
(2085),
(2100),
(2115),
(2130),
(2145),
(2160),
(2175),
(2190),
(2205),
(2220),
(2235),
(2250),
(2265),
(2280),
(2295),
(2310),
(2325),
(2340),
(2355),
(2370),
(2385),
(2400),
(2415),
(2430),
(2445),
(2460),
(2475),
(2490),
(2505),
(2520),
(2535),
(2550),
(2565),
(2580),
(2595),
(2610),
(2625),
(2640),
(2655),
(2670),
(2685),
(2700),
(2715),
(2730),
(2745),
(2760),
(2775),
(2790),
(2805),
(2820),
(2835),
(2850),
(2865),
(2880),
(2895),
(2910),
(2925),
(2940),
(2955),
(2970),
(2985),
(3000),
(3015),
(3030),
(3045),
(3060),
(3075),
(3090),
(3105),
(3120),
(3135),
(3150),
(3165),
(3180),
(3195),
(3210),
(3225),
(3240),
(3255),
(3270),
(3285),
(3300),
(3315),
(3330),
(3345),
(3360),
(3375),
(3390),
(3405),
(3420),
(3435),
(3450),
(3465),
(3480),
(3495),
(3510),
(3525),
(3540),
(3555),
(3570),
(3585),
(3600),
(3615),
(3630),
(3645),
(3660),
(3675)
) v246 (n)
CROSS JOIN 
(
VALUES
(0),
(15),
(30),
(45),
(60),
(75),
(90),
(105),
(120),
(135),
(150),
(165),
(180),
(195),
(210),
(225),
(240),
(255),
(270),
(285),
(300),
(315),
(330),
(345),
(360),
(375),
(390),
(405),
(420),
(435),
(450),
(465),
(480),
(495),
(510),
(525),
(540),
(555),
(570),
(585),
(600),
(615),
(630),
(645),
(660),
(675),
(690),
(705),
(720),
(735),
(750),
(765),
(780),
(795),
(810),
(825),
(840),
(855),
(870),
(885),
(900),
(915),
(930),
(945),
(960),
(975),
(990),
(1005),
(1020),
(1035),
(1050),
(1065),
(1080),
(1095),
(1110),
(1125),
(1140),
(1155),
(1170),
(1185),
(1200),
(1215),
(1230),
(1245),
(1260),
(1275),
(1290),
(1305),
(1320),
(1335),
(1350),
(1365),
(1380),
(1395),
(1410),
(1425),
(1440),
(1455),
(1470),
(1485),
(1500),
(1515),
(1530),
(1545),
(1560),
(1575),
(1590),
(1605),
(1620),
(1635),
(1650),
(1665),
(1680),
(1695),
(1710),
(1725),
(1740),
(1755),
(1770),
(1785),
(1800),
(1815),
(1830),
(1845),
(1860),
(1875),
(1890),
(1905),
(1920),
(1935),
(1950),
(1965),
(1980),
(1995),
(2010),
(2025),
(2040),
(2055),
(2070),
(2085),
(2100),
(2115),
(2130),
(2145),
(2160),
(2175),
(2190),
(2205),
(2220),
(2235),
(2250),
(2265),
(2280),
(2295),
(2310),
(2325),
(2340),
(2355),
(2370),
(2385),
(2400),
(2415),
(2430),
(2445),
(2460),
(2475),
(2490),
(2505),
(2520),
(2535),
(2550),
(2565),
(2580),
(2595),
(2610),
(2625),
(2640),
(2655),
(2670),
(2685),
(2700),
(2715),
(2730),
(2745),
(2760),
(2775),
(2790),
(2805),
(2820),
(2835),
(2850),
(2865),
(2880),
(2895),
(2910),
(2925),
(2940),
(2955),
(2970),
(2985),
(3000),
(3015),
(3030),
(3045),
(3060),
(3075),
(3090),
(3105),
(3120),
(3135),
(3150),
(3165),
(3180),
(3195),
(3210),
(3225),
(3240),
(3255),
(3270),
(3285),
(3300),
(3315),
(3330),
(3345),
(3360),
(3375),
(3390),
(3405),
(3420),
(3435),
(3450),
(3465),
(3480),
(3495),
(3510),
(3525),
(3540),
(3555),
(3570),
(3585),
(3600),
(3615),
(3630),
(3645),
(3660),
(3675),
(3690),
(3705),
(3720),
(3735),
(3750),
(3765),
(3780),
(3795),
(3810),
(3825),
(3840),
(3855),
(3870),
(3885),
(3900),
(3915),
(3930),
(3945),
(3960),
(3975),
(3990),
(4005),
(4020),
(4035),
(4050)
) v271 (n)
CROSS APPLY
(
VALUES
(CAST(v246.n * 271 + v271.n + 1 AS CHAR(8))),
(CAST(v246.n * 271 + v271.n + 2 AS CHAR(8))),
(CAST('FIZZ' AS CHAR(8))),
(CAST(v246.n * 271 + v271.n + 4 AS CHAR(8))),
(CAST('BUZZ' AS CHAR(8))),
(CAST('FIZZ' AS CHAR(8))),
(CAST(v246.n * 271 + v271.n + 7 AS CHAR(8))),
(CAST(v246.n * 271 + v271.n + 8 AS CHAR(8))),
(CAST('FIZZ' AS CHAR(8))),
(CAST('BUZZ' AS CHAR(8))),
(CAST(v246.n * 271 + v271.n + 11 AS CHAR(8))),
(CAST('FIZZ' AS CHAR(8))),
(CAST(v246.n * 271 + v271.n + 13 AS CHAR(8))),
(CAST(v246.n * 271 + v271.n + 14 AS CHAR(8))),
(CAST('FIZZBUZZ' AS CHAR(8)))
) res (fizzbuzz)

UNION ALL

SELECT v.fizzbuzz
FROM (
VALUES 
('999991'),
('999992'),
('FIZZ'),
('999994'),
('BUZZ'),
('FIZZ'),
('999997'),
('999998'),
('FIZZ'),
('BUZZ')
) v (fizzbuzz)

OPTION (MAXDOP 6, NO_PERFORMANCE_SPOOL);


评论


该解决方案实际上不进行任何计算,这有点作弊,但这实际上并不是一个明确的要求。该解决方案采用了即开即用的思维方式,这是其卓越的标志。我希望在输出中看到两列,因此可以更容易地将输出确认为正确。

– Max Vernon♦
19年4月4日在15:29

#2 楼

使用SQL Server 2014内存优化表和本机编译过程:

-- Setup
CREATE DATABASE InMem;
GO
ALTER DATABASE InMem
ADD FILEGROUP FG1
CONTAINS MEMORY_OPTIMIZED_DATA;
GO
ALTER DATABASE InMem
ADD FILE 
(
    NAME = 'FN1', 
    -- Change to suit your system
    FILENAME = 'C:\Program Files\Microsoft SQL Server\MSSQL12.SQL2014\MSSQL\DATA\FN1.mod'
)
TO FILEGROUP FG1;
GO
USE InMem;
GO
CREATE TYPE dbo.FizzBuzzTableType AS TABLE 
(
    n integer NOT NULL INDEX i,
    FizzBuzz varchar(8) NOT NULL
) WITH (MEMORY_OPTIMIZED = ON);
GO


本机过程:

CREATE PROCEDURE dbo.FizzBuzz
WITH 
    NATIVE_COMPILATION, 
    SCHEMABINDING, 
    EXECUTE AS OWNER
AS
BEGIN ATOMIC 
WITH 
(
    TRANSACTION ISOLATION LEVEL = SNAPSHOT, 
    LANGUAGE = N'english'
)   
    DECLARE @n AS dbo.FizzBuzzTableType;

    DECLARE @i integer = 1;
    WHILE @i < 50
    BEGIN
        IF @i % 15 = 0
        BEGIN
            INSERT @n (n, FizzBuzz) 
            VALUES (@i, 'FizzBuzz')
        END
        ELSE 
        BEGIN
            IF @i % 3 = 0
            BEGIN
                INSERT @n (n, FizzBuzz)
                VALUES (@i, 'Fizz')
            END
            ELSE 
            BEGIN
                IF @i % 5 = 0
                BEGIN
                    INSERT @n (n, FizzBuzz) 
                    VALUES (@i, 'Buzz')
                END
                ELSE
                BEGIN
                    INSERT @n (n, FizzBuzz) 
                    VALUES (@i, CONVERT(varchar(8), @i));
                END;
            END;
        END;

        SET @i += 1;
    END;

    SELECT
        N.n, 
        N.FizzBuzz
    FROM @n AS N
    ORDER BY
        N.n;
END;


测试:

SET NOCOUNT ON;
PRINT SYSUTCDATETIME();
GO
DECLARE @T AS dbo.FizzBuzzTableType;

INSERT @T (n, FizzBuzz)
EXECUTE dbo.FizzBuzz;
GO 100

PRINT SYSUTCDATETIME();


典型结果:


-- 95ms for 100 iterations, < 1ms each
2014-12-31 10:07:13.7993355
Beginning execution loop
Batch execution completed 100 times.
2014-12-31 10:07:13.8943409


将过程输出写入一个内存中的表变量,因为否则我们只是测试在SSMS中显示结果的速度。

一百万行

上面的本机过程大约需要12秒在1,000,000个数字上运行。在T-SQL中,有各种各样更快的方法可以完成相同的操作。我之前写过的一本书如下。当达到预期的并行计划时,它将在我的笔记本电脑上运行500万行,大约需要500毫秒:

IF  OBJECT_ID(N'tempdb..#Result', N'U') IS NOT NULL
    DROP TABLE #Result;

IF  OBJECT_ID(N'tempdb..#Thousand', N'U') IS NOT NULL
    DROP TABLE #Thousand;

SET NOCOUNT ON;
DECLARE @start datetime2(7) = SYSUTCDATETIME();

CREATE TABLE #Thousand 
(
    n integer NOT NULL,

    CONSTRAINT PK_#Thousand
    PRIMARY KEY CLUSTERED (n)
);

-- Add 1,000 rows numbered 0-999 to #Thousand
WITH 
    L1 (n) AS
(
    SELECT  V.n
    FROM    
    (
        VALUES  (0), (1), (2), (3), (4),
                (5), (6), (7), (8), (9)
    ) AS V (n)
),
    Thousand AS
(
    SELECT  n = 
        CONVERT
        (
            integer,
            ROW_NUMBER() OVER (
            ORDER BY (SELECT NULL))
            - 1
        )
    FROM L1
    CROSS JOIN L1 AS L2
    CROSS JOIN L1 AS L3
)
INSERT #Thousand (n)
SELECT n
FROM Thousand;

-- To hold the Fizz Buzz output
CREATE TABLE #Result 
(
    n integer NOT NULL, 
    result varchar(8) NOT NULL
);

INSERT #Result
SELECT 
    Million.n, 
    Million.result
FROM
(
    -- Modulo operation to encourage few outer rows parallelism
    SELECT  n
    FROM    #Thousand
    WHERE   n % 1 = 0
) AS T1
-- Outer Apply to keep the Compute Scalar parallel
OUTER APPLY
(
    SELECT
        F2.n, 
        F2.result
    FROM #Thousand AS T2
    CROSS APPLY
    (
        -- Row numbers 1 to 1,000,000
        SELECT  (T1.n * 1000) + T2.n + 1
    ) AS F1 (n)
    CROSS APPLY
    (
        -- The Fizz Buzz bit
        SELECT
            F1.n,
            result =
                CASE 
                    WHEN F1.n % 15 = 0 THEN 'FizzBuzz'
                    WHEN F1.n % 3 = 0 THEN 'Buzz'
                    WHEN F1.n % 5 = 0 THEN 'Fizz'
                    ELSE CONVERT(varchar(8), F1.n)
                END
    ) AS F2
) AS Million
OPTION  (MAXDOP 4, QUERYTRACEON 9481);

PRINT DATEDIFF(MILLISECOND, @start, SYSUTCDATETIME());


评论


您说过“外部申请,以保持计算标量平行”-您能指出我的意思吗?

– Max Vernon♦
2014年12月31日在15:03

不。正是这样,外部应用有时可以帮助防止计算标量移出并行区域。完全没有记录,可以随时更改。

–保罗·怀特♦
2014年12月31日15:06

#3 楼

这个在我的机器上运行的时间与您的第一个(0毫秒)相同。我不确定它是否会扩展得更快。

;WITH t AS (
    SELECT RowNum = ROW_NUMBER() OVER (ORDER BY o.object_id)
    FROM sys.objects o
)
SELECT t.RowNum
    , Cxa.Fizz + CxB.Buzz
FROM t
CROSS APPLY (SELECT CASE WHEN t.RowNum % 3 = 0 THEN 'FIZZ' ELSE '' END) CxA(Fizz)
CROSS APPLY (SELECT CASE WHEN t.RowNum % 5 = 0 THEN 'BUZZ' ELSE '' END) CxB(Buzz)
WHERE t.RowNum < 50;


#4 楼

我想出的最佳版本可以在计算机上运行30毫秒:

WITH t AS (
    SELECT 1 As RowNum
    Union ALL
    Select RowNum + 1
    From t
    Where RowNum < 49
)
SELECT t.RowNum
, SubString('FIZZ', (t.RowNum % 3)*10, 5) + SubString('BUZZ', (t.RowNum % 5)*10, 5)
FROM t;


#5 楼

根据sqlfiddle.com的说法,这需要7毫秒:

select coalesce(fizz + buzz, fizz, buzz, cast(n as varchar)) as FizzBuzz
  from (
    select n0 + 3 * n3 + 9 * n9 + 27 * n27 + 81 * n81 as n
        from
            (select 0 as n0  union all select 1 union all select 2 as n0)  as n0,
            (select 0 as n3  union all select 1 union all select 2 as n3)  as n3,
            (select 0 as n9  union all select 1 union all select 2 as n9)  as n9,
            (select 0 as n27 union all select 1 union all select 2 as n27) as n27,
            (select 0 as n81 union all select 1                    as n81) as n81
  ) as stupidalias1
  left outer join
    (select 3 as fizzstep, 'Fizz' as fizz) as stupidalias2 on n % fizzstep = 0
  left outer join
    (select 5 as buzzstep, 'Buzz' as buzz) as stupidalias3 on n % buzzstep = 0
  where n between 1 and 100
  order by n;


不使用任何表,存储的proc或CTE。

#6 楼

我得到了一个本机编译的存储过程的合理版本,该过程在500-800毫秒内可处理100万行。这是我在这里进行的T-SQL转换,在亚当·马汉尼(Adam Machanic)关于此处按位操作的博客的帮助下,从这里进行了按位算法。

我(希望)遵循与@PaulWhite的500ms相同的规则/ 1百万行proc,即生成结果但不显示/不将它们作为计时的一部分传递出去。它对于内存速度表必须是内存表上的哈希索引,对于我来说,存储桶大小为4,194,304或8,388,608似乎是我的最佳选择,尽管显然可以提供较高的空存储桶数。

USE hekatondb
GO

--NB: SQLCMD script, Enable via: Query menu > SQLCMD Mode
:setvar bucketCount 4194304
--:setvar bucketCount 8388608

IF OBJECT_ID('dbo.usp_hekatonFizzBuzz') IS NOT NULL
DROP PROC dbo.usp_hekatonFizzBuzz 
GO
IF OBJECT_ID('dbo.FizzBuzz') IS NOT NULL
DROP TABLE dbo.FizzBuzz
GO


IF OBJECT_ID('dbo.FizzBuzz') IS NULL
CREATE TABLE dbo.FizzBuzz (
    Number      INT NOT NULL,
    Result      VARCHAR(8) NULL,

    CONSTRAINT PK_FizzBuzz PRIMARY KEY NONCLUSTERED HASH ( Number ) WITH ( BUCKET_COUNT = $(bucketCount) )

) WITH ( MEMORY_OPTIMIZED = ON, DURABILITY = SCHEMA_ONLY )
GO


CREATE PROC dbo.usp_hekatonFizzBuzz 

    @limit  INT

WITH
    NATIVE_COMPILATION, 
    SCHEMABINDING, 
    EXECUTE AS OWNER
AS
BEGIN ATOMIC
WITH
(
    TRANSACTION ISOLATION LEVEL = SNAPSHOT, 
    LANGUAGE = N'english'
)   

    DECLARE @acc INT = 810092048    -- 11 00 00 01 00 10 01 00 00 01 10 00 01 00 00
    DECLARE @i INT = 1
    DECLARE @c INT

    WHILE @i <= @limit
    BEGIN 

        SELECT
            @c = @acc & 3,
            @acc = ( @acc / 4 ) | ( @c * 268435456 )

        INSERT dbo.FizzBuzz ( Number, Result )
        SELECT @i, SUBSTRING( '       Fizz    Buzz    FizzBuzz', @c * 8, @c * 4 )

        SET @i += 1

    END

END
GO

DELETE dbo.FizzBuzz
DECLARE @startDate DATETIME2 = SYSUTCDATETIME();

EXEC dbo.usp_hekatonFizzBuzz 1000000

SELECT DATEDIFF( millisecond, @startDate, SYSUTCDATETIME() ) diff
GO 10


#7 楼

我找到并玩了没有CTE的单个子选择。查询统计信息中的max_elapsed_time显示1036

 SELECT num,
        CASE    WHEN mod3 + mod5 = 0 THEN 'FizzBuzz'
                WHEN mod5 = 0 THEN 'Buzz'
                WHEN mod3 = 0 THEN 'Fizz'
                ELSE CONVERT(VARCHAR(8), num)
        END
 FROM 
 (
    SELECT  number as num,
            number % 3 AS mod3,
            number % 5 AS mod5
    FROM    master.dbo.spt_values
    WHERE   name IS NULL
            AND number BETWEEN 1 AND 101
 ) AS numbers;


#8 楼

我对编写的代码一无所知,我只想看一看十亿行就需要多长时间!

;WITH T(N) AS (SELECT N FROM (VALUES (NULL),(NULL),(NULL),(NULL),(NULL),(NULL),(NULL),(NULL),(NULL),(NULL)) AS X(N))
    ,NUMS(N) AS (SELECT TOP(1000000000) ROW_NUMBER() OVER (ORDER BY (SELECT NULL))  AS N FROM T T1,T T2,T T3,T T4,T T5,T T6,T T7,T T8,T T9, T T10)
    SELECT  n, ca.fb
    INTO #fizzywizzy
    FROM    NUMS n
            CROSS APPLY ( SELECT    CASE WHEN n.N % 15 = 0 THEN 'FizzBuzz'
                                         WHEN n.N % 3 = 0 THEN 'Fizz'
                                         WHEN n.N % 5 = 0 THEN 'Buzz'
                                         ELSE CAST(n AS VARCHAR)
                                    END AS [fb]
                        ) ca


答案是:大约10分钟。

SQL Server parse and compile time: 
   CPU time = 13 ms, elapsed time = 13 ms.

 SQL Server Execution Times:
   CPU time = 648625 ms,  elapsed time = 618025 ms.


#9 楼

PostgreSQL

PostgreSQL提供generate_series,这是一个表值函数(设置返回函数),它使此操作变得更加简单。我假设当数字3或5都不输入时,您什么都不想要。

SELECT x, str
FROM generate_series(1,49) AS gs(x)
CROSS JOIN LATERAL (VALUES (CASE
  WHEN x % 15 =0 THEN 'Fizzbuzz'
  WHEN x % 3  =0 THEN 'Fizz'
  WHEN x % 5  =0 THEN 'Buzz'
END)) AS c(str)
WHERE str IS NOT NULL;