How It Works: Bulk Insert (BCP.exe) – Injecting a Sort Operation
I ran across an interesting issue this week related to the use of Bulk Insert/BCP and sorting.
The customer reported that they could not BCP into a table using “keepidentity” and after more debugging I found that “keepidentity” was not the issue but the issue was the addition of a clustered index.
The table is defined with 100s of sparse columns:
CREATE TABLE [dbo].[Mytable](
[DataId] [int] IDENTITY(1,1) NOT FOR REPLICATION NOT NULL,
[MainId] [int] NOT NULL,
[YearId] [char](2) NOT NULL,
[UserId] [int] NOT NULL,
[DateAdded] [datetime] NOT NULL,
[DateLstMod] [datetime] NOT NULL,
[C1] [varchar](max) SPARSE NULL,
[C2] [varchar](max) SPARSE NULL,
[C3] [varchar](max) SPARSE NULL,
[C547] [varchar](max) NULL,
[C548] [varchar](max) NULL,
[C549] [varchar](max) NULL);
A row was added to the table and bcp.exe used to export the data to a file followed by bcp.exe to import the data.
When the table has no-index the import succeeds but when the following index (it turns out any clustered index is present) the import fails.
CONSTRAINT [Pk_[DataId] PRIMARY KEY CLUSTERED
)WITH (PAD_INDEX = OFF, STATISTICS_NORECOMPUTE = OFF, IGNORE_DUP_KEY = OFF, ALLOW_ROW_LOCKS = ON, ALLOW_PAGE_LOCKS = ON, FILLFACTOR = 95, OPTIMIZE_FOR_SEQUENTIAL_KEY = OFF) ON [PRIMARY]
) ON [PRIMARY] TEXTIMAGE_ON [PRIMARY]
SQLState = 37000, NativeError = 511 Error = [Microsoft][ODBC Driver 13 for SQL Server][SQL Server] Cannot create a row of size 8161 which is greater than the allowable maximum row size of 8060.
I was able to track the behavior the use of a sort for the bulk operation. When an index/constraint is present bulk insert may choose to sort the incoming data in a temporary table. The incoming stream is using a sort and the additional overhead for the sort causes the row to exceed the size limit.
BCP SORTS: Bulk Insert/BCP uses a set of heuristics to determine when to inject and sort the incoming data. In general, bcp.exe and bulk insert are considered unknown sort sources and a sort will be injected to handle index/constraint operations.
You can use the following to get a ballpark idea of the row size. In this customer’s example the size was just under the row limit. When the sort is injected the row size exceeds the limit, generating to the error.
-- Estimate the row size
declare @dynamicsql varchar(max)
set @dynamicsql = 'select [DataId], (0'
select @dynamicsql = @dynamicsql + ' + isnull(datalength(' + name + '), 0)' from syscolumns where id = object_id('MyTable') and xusertype <> 167
select @dynamicsql = @dynamicsql + ' + 24'
from syscolumns where id = object_id('MyTable') and xusertype = 167
set @dynamicsql = @dynamicsql + ') as rowsize from MyTable2 order by DataId'
There are various workarounds to the issue. Use non-clustered index, drop the index, import, and create the index, or BCP into a table without an index and then insert into the target table.
The following stack shows the Bulk operation with the additional sort taking place.