How It Works: Bulk Insert (BCP.exe) – Injecting a Sort Operation

How It Works: Bulk Insert (BCP.exe)​​ – Injecting a Sort Operation

 

 

I ran across an interesting issue this week related to the use of Bulk Insert/BCP and sorting.

The customer reported that they could not BCP into a table using “keepidentity” and after more debugging I found that “keepidentity” was not the​​ issue but​​ the issue​​ was the addition of a​​ clustered​​ index.

The table is defined with 100s of sparse columns:

CREATE​​ TABLE​​ [dbo].[Mytable](

 [DataId] [int]​​ IDENTITY(1,1)​​ NOT​​ FOR​​ REPLICATION​​ NOT​​ NULL,

 [MainId] [int]​​ NOT​​ NULL,

 [YearId] [char](2)​​ NOT​​ NULL,

 [UserId] [int]​​ NOT​​ NULL,

 [DateAdded] [datetime]​​ NOT​​ NULL,

 [DateLstMod] [datetime]​​ NOT​​ NULL,

 [C1] [varchar](max)​​ SPARSE​​ NULL,

 [C2] [varchar](max)​​ SPARSE​​ NULL,

 [C3] [varchar](max)​​ SPARSE​​ NULL,

[C547] [varchar](max)​​ NULL,

 [C548] [varchar](max)​​ NULL,

 [C549] [varchar](max)​​ NULL);

A row was added to the table and bcp.exe used to export the data to a file​​ followed by​​ bcp.exe to import the data.​​ 

When the table has no-index the import succeeds but when the following index (it turns out any clustered index​​ is present) the import fails.

CONSTRAINT​​ [Pk_[DataId]​​ PRIMARY​​ KEY​​ CLUSTERED​​ 

(

 [DataId]​​ ASC

)WITH​​ (PAD_INDEX​​ =​​ OFF,​​ STATISTICS_NORECOMPUTE​​ =​​ OFF,​​ IGNORE_DUP_KEY​​ =​​ OFF,​​ ALLOW_ROW_LOCKS​​ =​​ ON,​​ ALLOW_PAGE_LOCKS​​ =​​ ON,​​ FILLFACTOR​​ =​​ 95,​​ OPTIMIZE_FOR_SEQUENTIAL_KEY​​ =​​ OFF)​​ ON​​ [PRIMARY]

)​​ ON​​ [PRIMARY]​​ TEXTIMAGE_ON​​ [PRIMARY]

SQLState = 37000, NativeError = 511​​ Error = [Microsoft][ODBC Driver 13 for SQL Server][SQL Server]​​ Cannot create a row of size 8161 which is greater than the allowable maximum row size of 8060.

I was able to track the behavior the use of a sort for the bulk operation. ​​ When an index/constraint is present bulk insert may choose to sort the incoming data in a temporary table. ​​ The incoming stream is using a sort and the​​ additional​​ overhead for the sort causes the row to exceed the size limit.

BCP SORTS:​​ Bulk Insert/BCP uses a set of heuristics to determine when to inject and sort the incoming data.  ​​​​ In general, bcp.exe and bulk insert are considered unknown sort sources and a sort will be injected to handle index/constraint operations.


You can use the following to get a ballpark idea of the row size. ​​ In this​​ customer’s​​ example the size was​​ just under the row limit. ​​ When the sort is injected the row size exceeds the limit,​​ generating​​ to the error.

-- Estimate the row size

--

declare​​ @dynamicsql​​ varchar(max)

set​​ @dynamicsql​​ =​​ 'select [DataId], (0'

 

select​​ @dynamicsql​​ =​​ @dynamicsql​​ +​​ ' + isnull(datalength('​​ +​​ name​​ +​​ '), 0)'​​ from​​ syscolumns​​ where​​ id​​ =​​ object_id('MyTable') ​​​​ and​​ xusertype​​ <>​​ 167

 

select​​ @dynamicsql​​ =​​ @dynamicsql​​ +​​ ' + 24'​​ 

from​​ syscolumns​​ where​​ id​​ =​​ object_id('MyTable') ​​​​ and​​ xusertype​​ =​​ 167

 

set​​ @dynamicsql​​ =​​ @dynamicsql​​ +​​ ') as rowsize from​​ MyTable2 order by DataId'

 

select​​ @dynamicsql

exec​​ (@dynamicsql)

Workaround
There are various workarounds to the issue. ​​ Use non-clustered index,​​ drop the index,​​ import,​​ and create the index, or​​ BCP into a table without an index and​​ then​​ insert into the target table.

Stack​​ Proof
The following stack shows the Bulk operation with the additional sort taking place.

    01​​ sqldk!ex_raise

    02 sqlmin!LongRecord::SetDataInternal

    03 sqlmin!LongRecord::SetData

    04 sqlmin!CreateNewRecordNoCheck

    05 sqlmin!CreateNewRecord

    06 sqlmin!RowsetSorted::InsertRow

    07 sqlmin!CValRowNoHrow::SetDataX

    08 sqlTsEs!CallEsFn

    09 sqlTsEs!CEsExec::GeneralEval

    0a sqlmin!CEsRuntime::Eval

   ​​ 0b sqlmin!CQScanSortNew::PushRow

    0c sqlmin!CQScanSortNew::BuildSortTable

    0d sqlmin!CQScanSortNew::OpenHelper

    0e sqlmin!CQScanNew::OpenHelper

    0f sqlmin!CQScanUpdateNew::Open

    10 sqlmin!CQueryScan::StartupQuery

    11 sqllang!CXStmtQuery::SetupQueryScanAndExpression

    12 sqllang!CXStmtQuery::InitForExecute

    13 sqllang!CXStmtQuery::ErsqExecuteQuery

    14 sqllang!CXStmtDML::XretDMLExecute

   ​​ 15 sqllang!CXStmtInsertBulk::XretDoExecute

    16 sqllang!CXStmtInsertBulk::XretExecute

    17 sqllang!CExecStmtLoopVars::ExecuteXStmtAndSetXretReturn

   ​​ 18 sqllang!CMsqlExecContext::ExecuteStmts<1,0>

    19 sqllang!CMsqlExecContext::FExecute

    1a sqllang!CSQLSource::Execute

  • Bob Dorr