In this vignette we provide examples of how to use entrydatar to download specific files from the QCEW.
In the second part we give rudimentary documentation of the features that are of principal interest when working with the data. For an overview of the QCEW, head over to the BLS website.

The vignette is organized as follows:

  1. Downloading data from the QCEW
  2. Documentation

Downloading data from the QCEW

To get ready to download data, we load some libraries that the package might have forgotten to call (the package works with all tables in data.table format).

library(dplyr)
library(data.table)
library(entrydatar)

General download

For example, if we are interested in downloading aggregate-level data, we use cut 10. See the documentation below for a definition of what is available and how the different QCEW cuts are mapped.

as_tibble(dt_agg)
# A tibble: 16 x 42
   area_fips own_code indus… agglv… size_…  year   qtr discl… qtrly… month… month… month… total_… taxa… qtrl… avg_… lq_d… lq_q… lq_m… lq_m… lq_m… lq_t… lq_t… lq_q… lq_a… oty_… oty_q… oty_… oty_mo… oty_m… oty_mo… oty_m… oty_mo… oty_m… oty_to… oty_t… oty_tax… oty_… oty_qt… oty_… oty_a… oty_…
   <chr>        <dbl>  <dbl>  <dbl>  <dbl> <dbl> <dbl> <chr>   <dbl>  <dbl>  <dbl>  <dbl>   <dbl> <dbl> <dbl> <dbl> <dbl> <dbl> <dbl> <dbl> <dbl> <dbl> <dbl> <dbl> <dbl> <dbl>  <dbl> <dbl>   <dbl>  <dbl>   <dbl>  <dbl>   <dbl>  <dbl>   <dbl>  <dbl>    <dbl> <dbl>   <dbl> <dbl>  <dbl> <dbl>
 1 US000            0   10.0   10.0      0  1990  1.00 ""     6.02e⁶ 1.06e⁸ 1.07e⁸ 1.08e⁸ 6.23e¹¹     0     0   448    NA  1.00  1.00  1.00  1.00  1.00     0     0  1.00    NA 127332  2.20  2.09e⁶  2.00   2.12e⁶  2.00   2.03e⁶  1.90  3.91e¹⁰  6.70  -3.57e¹¹  -100 -7.56e⁹  -100  20.0   4.70
 2 US000            0   10.0   10.0      0  1990  2.00 ""     6.04e⁶ 1.08e⁸ 1.09e⁸ 1.10e⁸ 6.29e¹¹     0     0   443    NA  1.00  1.00  1.00  1.00  1.00     0     0  1.00    NA 154120  2.60  1.87e⁶  1.80   2.12e⁶  2.00   1.82e⁶  1.70  3.67e¹⁰  6.20  -1.82e¹¹  -100 -4.01e⁹  -100  19.0   4.50
 3 US000            0   10.0   10.0      0  1990  3.00 ""     6.09e⁶ 1.08e⁸ 1.09e⁸ 1.10e⁸ 6.24e¹¹     0     0   441    NA  1.00  1.00  1.00  1.00  1.00     0     0  1.00    NA 164809  2.80  1.68e⁶  1.60   1.79e⁶  1.70   1.34e⁶  1.20  3.37e¹⁰  5.70  -1.10e¹¹  -100 -2.47e⁹  -100  18.0   4.30
 4 US000            0   10.0   10.0      0  1990  4.00 ""     6.12e⁶ 1.09e⁸ 1.09e⁸ 1.09e⁸ 6.88e¹¹     0     0   483    NA  1.00  1.00  1.00  1.00  1.00     0     0  1.00    NA 149102  2.50  1.01e⁶  0.900  6.69e⁵  0.600  2.75e⁵  0.300 3.90e¹⁰  6.00  -8.46e¹⁰  -100 -1.85e⁹  -100  25.0   5.50
 5 US000            0   10.0   10.0      0  1991  1.00 ""     6.30e⁶ 1.05e⁸ 1.05e⁸ 1.06e⁸ 6.37e¹¹     0     0   465    NA  1.00  1.00  1.00  1.00  1.00     0     0  1.00    NA 284012  4.70 -1.03e⁶ -1.00  -1.51e⁶ -1.40  -1.77e⁶ -1.60  1.40e¹⁰  2.20   0           0  0          0  17.0   3.80
 6 US000            0   10.0   10.0      0  1991  2.00 ""     6.36e⁶ 1.06e⁸ 1.07e⁸ 1.08e⁸ 6.45e¹¹     0     0   463    NA  1.00  1.00  1.00  1.00  1.00     0     0  1.00    NA 317123  5.30 -1.84e⁶ -1.70  -2.05e⁶ -1.90  -2.24e⁶ -2.00  1.64e¹⁰  2.60   0           0  0          0  20.0   4.50
 7 US000            0   10.0   10.0      0  1991  3.00 ""     6.39e⁶ 1.06e⁸ 1.07e⁸ 1.08e⁸ 6.41e¹¹     0     0   462    NA  1.00  1.00  1.00  1.00  1.00     0     0  1.00    NA 303568  5.00 -2.13e⁶ -2.00  -1.93e⁶ -1.80  -2.03e⁶ -1.90  1.76e¹⁰  2.80   0           0  0          0  21.0   4.80
 8 US000            0   10.0   10.0      0  1991  4.00 ""     6.43e⁶ 1.08e⁸ 1.08e⁸ 1.08e⁸ 7.05e¹¹     0     0   503    NA  1.00  1.00  1.00  1.00  1.00     0     0  1.00    NA 317664  5.20 -1.70e⁶ -1.60  -1.61e⁶ -1.50  -1.52e⁶ -1.40  1.72e¹⁰  2.50   0           0  0          0  20.0   4.10
 9 US000            0   10.0   10.0      0  1992  1.00 ""     6.48e⁶ 1.05e⁸ 1.05e⁸ 1.06e⁸ 6.64e¹¹     0     0   486    NA  1.00  1.00  1.00  1.00  1.00     0     0  1.00    NA 176781  2.80 -5.04e⁵ -0.500 -1.48e⁵ -0.100 -1.06e⁵ -0.100 2.72e¹⁰  4.30   0           0  0          0  21.0   4.50
10 US000            0   10.0   10.0      0  1992  2.00 ""     6.50e⁶ 1.07e⁸ 1.08e⁸ 1.09e⁸ 6.72e¹¹     0     0   479    NA  1.00  1.00  1.00  1.00  1.00     0     0  1.00    NA 140624  2.20  5.39e⁵  0.500  6.40e⁵  0.600  6.08e⁵  0.600 2.69e¹⁰  4.20   0           0  0          0  16.0   3.50
11 US000            0   10.0   10.0      0  1992  3.00 ""     6.52e⁶ 1.07e⁸ 1.07e⁸ 1.08e⁸ 6.72e¹¹     0     0   482    NA  1.00  1.00  1.00  1.00  1.00     0     0  1.00    NA 131248  2.10  7.06e⁵  0.700  2.48e⁵  0.200  6.40e⁵  0.600 3.06e¹⁰  4.80   0           0  0          0  20.0   4.30
12 US000            0   10.0   10.0      0  1992  4.00 ""     6.57e⁶ 1.09e⁸ 1.09e⁸ 1.09e⁸ 7.79e¹¹     0     0   550    NA  1.00  1.00  1.00  1.00  1.00     0     0  1.00    NA 134740  2.10  1.09e⁶  1.00   1.05e⁶  1.00   1.22e⁶  1.10  7.36e¹⁰ 10.4    0           0  0          0  47.0   9.30
13 US000            0   10.0   10.0      0  1993  1.00 ""     6.62e⁶ 1.06e⁸ 1.07e⁸ 1.07e⁸ 6.67e¹¹     0     0   480    NA  1.00  1.00  1.00  1.00  1.00     0     0  1.00    NA 138525  2.10  1.43e⁶  1.40   1.70e⁶  1.60   1.69e⁶  1.60  2.54e ⁹  0.400  0           0  0          0 - 6.00 -1.20
14 US000            0   10.0   10.0      0  1993  2.00 ""     6.65e⁶ 1.09e⁸ 1.10e⁸ 1.11e⁸ 7.04e¹¹     0     0   494    NA  1.00  1.00  1.00  1.00  1.00     0     0  1.00    NA 155234  2.40  1.85e⁶  1.70   1.75e⁶  1.60   1.98e⁶  1.80  3.19e¹⁰  4.70   0           0  0          0  15.0   3.10
15 US000            0   10.0   10.0      0  1993  3.00 ""     6.70e⁶ 1.09e⁸ 1.09e⁸ 1.11e⁸ 7.11e¹¹     0     0   499    NA  1.00  1.00  1.00  1.00  1.00     0     0  1.00    NA 176159  2.70  2.15e⁶  2.00   2.31e⁶  2.20   2.35e⁶  2.20  3.86e¹⁰  5.70   0           0  0          0  17.0   3.50
16 US000            0   10.0   10.0      0  1993  4.00 ""     6.75e⁶ 1.11e⁸ 1.11e⁸ 1.12e⁸ 8.06e¹¹     0     0   556    NA  1.00  1.00  1.00  1.00  1.00     0     0  1.00    NA 176528  2.70  2.42e⁶  2.20   2.54e⁶  2.30   2.80e⁶  2.60  2.76e¹⁰  3.50   0           0  0          0   6.00  1.10

Note that the dataset can be large. Downloading most of the 3- and 4-digit industry cuts across counties and MSAs yields about 57 million rows. Saved in .rds format, the result takes around 2 GB.
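The save and reload step can be sketched in base R as follows. This is a minimal illustration on a toy table, not the package's own workflow: `dt_toy` and `rds_path` are hypothetical names, and the choice of `compress = "xz"` is an assumption that trades write speed for a smaller file.

```r
# Toy stand-in for a downloaded table (the real one can have tens of millions of rows)
dt_toy <- data.frame(
  area_fips     = c("US000", "01000", "01001"),
  industry_code = c("10", "10", "10"),
  year          = c(1990L, 1990L, 1990L),
  stringsAsFactors = FALSE
)

# Write to a temporary .rds file; "xz" compresses harder (and slower) than the default gzip
rds_path <- tempfile(fileext = ".rds")
saveRDS(dt_toy, rds_path, compress = "xz")

# Read the table back and compare it with the original
dt_back <- readRDS(rds_path)
identical(dt_toy, dt_back)
```

For very large pulls it may be worth dropping unused columns before saving.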

On the other hand, for something more precise, say national data by size class at the 6-digit industry level, we would call cut 28:

as_tibble(dt_naics)
# A tibble: 208,667 x 47
   area… own_… indus… aggl… size…  year   qtr disc… area… own_… indu… aggl… size… qtrly… month… month… month… total… taxa… qtrl… avg_… lq_d… lq_q… lq_m… lq_m… lq_m… lq_t… lq_t… lq_q… lq_a… oty_… oty_… oty_… oty_… oty_… oty_… oty_… oty_… oty_… oty_… oty_… oty_… oty_… oty_… oty_… oty_… oty_…
   <chr> <dbl>  <dbl> <dbl> <dbl> <dbl> <dbl> <dbl> <dbl> <dbl> <dbl> <dbl> <dbl>  <dbl>  <dbl>  <dbl>  <dbl>  <dbl> <dbl> <dbl> <dbl> <dbl> <dbl> <dbl> <dbl> <dbl> <dbl> <dbl> <dbl> <dbl> <dbl> <dbl> <dbl> <dbl> <dbl> <dbl> <dbl> <dbl> <dbl> <dbl> <dbl> <dbl> <dbl> <dbl> <dbl> <dbl> <dbl>
 1 US000  5.00 111110  28.0  1.00  1990  1.00    NA    NA    NA    NA    NA    NA 2.16354    353    371   1.57e⁶     0     0   337    NA  1.00  1.00  1.00  1.00  1.00     0     0  1.00    NA     0     0     0     0     0     0     0     0     0     0     0     0     0     0     0     0
 2 US000  5.00 111110  28.0  2.00  1990  1.00    NA    NA    NA    NA    NA    NA 9.80547    577    640   1.95e⁶     0     0   255    NA  1.00  1.00  1.00  1.00  1.00     0     0  1.00    NA     0     0     0     0     0     0     0     0     0     0     0     0     0     0     0     0
 3 US000  5.00 111110  28.0  3.00  1990  1.00    NA    NA    NA    NA    NA    NA 4.00439    450    519   1.32e⁶     0     0   216    NA  1.00  1.00  1.00  1.00  1.00     0     0  1.00    NA     0     0     0     0     0     0     0     0     0     0     0     0     0     0     0     0
 4 US000  5.00 111110  28.0  4.00  1990  1.00    NA    NA    NA    NA    NA    NA 1.40336    336    382   1.01e⁶     0     0   220    NA  1.00  1.00  1.00  1.00  1.00     0     0  1.00    NA     0     0     0     0     0     0     0     0     0     0     0     0     0     0     0     0
 5 US000  5.00 111120  28.0  1.00  1990  1.00    NA    NA    NA    NA    NA    NA 2.6039.0   33.0   39.0 1.30e⁵     0     0   270    NA  1.00  1.00  1.00  1.00  1.00     0     0  1.00    NA     0     0     0     0     0     0     0     0     0     0     0     0     0     0     0     0
 6 US000  5.00 111120  28.0  2.00  1990  1.00    NA    NA    NA    NA    NA    NA 5.00e⁰   21.0   23.0   28.0 6.87e⁴     0     0   220    NA  1.00  1.00  1.00  1.00  1.00     0     0  1.00    NA     0     0     0     0     0     0     0     0     0     0     0     0     0     0     0     0
 7 US000  5.00 111130  28.0  1.00  1990  1.00    NA    NA    NA    NA    NA    NA 8.2076.0   73.0   85.0 3.28e⁵     0     0   324    NA  1.00  1.00  1.00  1.00  1.00     0     0  1.00    NA     0     0     0     0     0     0     0     0     0     0     0     0     0     0     0     0
 8 US000  5.00 111140  28.0  1.00  1990  1.00    NA    NA    NA    NA    NA    NA 1.921369   1314   1575   5.09e⁶     0     0   276    NA  1.00  1.00  1.00  1.00  1.00     0     0  1.00    NA     0     0     0     0     0     0     0     0     0     0     0     0     0     0     0     0
 9 US000  5.00 111140  28.0  2.00  1990  1.00    NA    NA    NA    NA    NA    NA 8.50374    404    526   1.34e⁶     0     0   237    NA  1.00  1.00  1.00  1.00  1.00     0     0  1.00    NA     0     0     0     0     0     0     0     0     0     0     0     0     0     0     0     0
10 US000  5.00 111140  28.0  4.00  1990  1.00    NA    NA    NA    NA    NA    NA 8.00e⁰  193    211    224   7.45e⁵     0     0   274    NA  1.00  1.00  1.00  1.00  1.00     0     0  1.00    NA     0     0     0     0     0     0     0     0     0     0     0     0     0     0     0     0
# ... with 208,657 more rows

Faster download with annual frequencies

The BLS provides quarterly data as well as annual averages, which are lighter. There is now a frequency option to download the annual averages directly. The default is quarterly, so everything should be backward compatible. Let’s first look at the quarterly data (warning: it is slow):

as_tibble(dt_naics)
# A tibble: 18,028 x 42
   area_fips own_code indus… agglv… size_…  year   qtr discl… qtrly_… month… month… month… total… taxa… qtrl… avg_… lq_d… lq_q… lq_m… lq_m… lq_m… lq_t… lq_t… lq_q… lq_a… oty_… oty_… oty_q… oty_mo… oty_mo… oty_mo… oty_m… oty_mo… oty_m… oty_to… oty_to… oty_… oty_… oty_… oty_… oty_av… oty_av…
   <chr>        <dbl>  <dbl>  <dbl>  <dbl> <dbl> <dbl> <chr>    <dbl>  <dbl>  <dbl>  <dbl>  <dbl> <dbl> <dbl> <dbl> <dbl> <dbl> <dbl> <dbl> <dbl> <dbl> <dbl> <dbl> <dbl> <dbl> <dbl>  <dbl>   <dbl>   <dbl>   <dbl>  <dbl>   <dbl>  <dbl>   <dbl>   <dbl> <dbl> <dbl> <dbl> <dbl>   <dbl>   <dbl>
 1 US000         1.00   1125   16.0      0  2000  1.00 ""        2.00 7.00e⁰ 7.00e⁰ 7.00e⁰ 5.24e⁴     0     0   576    NA  1.00  1.00  1.00  1.00  1.00     0     0  1.00    NA  0      0    -  1.00 -12.5   -  1.00 -12.5  -  1.00 -12.5  -1.90- 3.50      0     0     0     0   54.0   10.3  
 2 US000         1.00   1125   16.0      0  2000  2.00 ""        2.00 7.00e⁰ 7.00e⁰ 8.00e⁰ 5.59e⁴     0     0   587    NA  1.00  1.00  1.00  1.00  1.00     0     0  1.00    NA  0      0    -  1.00 -12.5      0      0       1.00  14.3   1.26e⁴  29.0       0     0     0     0  132     29.0  
 3 US000         1.00   1125   16.0      0  2000  3.00 ""        2.00 8.00e⁰ 8.00e⁰ 8.00e⁰ 5.99e⁴     0     0   576    NA  1.00  1.00  1.00  1.00  1.00     0     0  1.00    NA  0      0       1.00  14.3      1.00  14.3     1.00  14.3   9.7419.4       0     0     0     0   25.0    4.50 
 4 US000         1.00   1125   16.0      0  2000  4.00 ""        3.00 8.00e⁰ 8.00e⁰ 8.00e⁰ 6.42e⁴     0     0   617    NA  1.00  1.00  1.00  1.00  1.00     0     0  1.00    NA  1.00  50.0     1.00  14.3      1.00  14.3     1.00  14.3   1.34e⁴  26.4       0     0     0     0   59.0   10.6  
 5 US000         1.00   1131   16.0      0  2000  1.00 ""       78.0  4.093.923.933.66e⁷     0     0   706    NA  1.00  1.00  1.00  1.00  1.00     0     0  1.00    NA -1.00 - 1.30 -237    - 5.50  -217    - 5.20 -282    - 6.70 -1.22e⁵ - 0.300     0     0     0     0   38.0    5.70 
 6 US000         1.00   1131   16.0      0  2000  2.00 ""       78.0  4.194.575.264.77e⁷     0     0   786    NA  1.00  1.00  1.00  1.00  1.00     0     0  1.00    NA  1.00   1.30 -315    - 7.00  -255    - 5.30  250      5.00  7.70e⁶  19.2       0     0     0     0  142     22.0  
 7 US000         1.00   1131   16.0      0  2000  3.00 ""       78.0  5.855.675.475.86e⁷     0     0   796    NA  1.00  1.00  1.00  1.00  1.00     0     0  1.00    NA  0      0    - 97.0  - 1.60  -167    - 2.90 -110    - 2.00 -1.43e⁶ - 2.40      0     0     0     0 -  1.00 - 0.100
 8 US000         1.00   1131   16.0      0  2000  4.00 ""       78.0  4.844.494.435.09e⁷     0     0   853    NA  1.00  1.00  1.00  1.00  1.00     0     0  1.00    NA  0      0    -502    - 9.40  -146    - 3.10  205      4.80 -1.07e⁵ - 0.200     0     0     0     0   25.0    3.00 
 9 US000         1.00   2211   16.0      0  2000  1.00 ""      140    1.23e⁴ 1.23e⁴ 1.23e⁴ 1.91e⁸     0     0  1193    NA  1.00  1.00  1.00  1.00  1.00     0     0  1.00    NA -5.00 - 3.40 -201    - 1.60  -157    - 1.30 -125    - 1.00  1.54e⁷   8.80      0     0     0     0  110     10.2  
10 US000         1.00   2211   16.0      0  2000  2.00 ""      141    1.24e⁴ 1.24e⁴ 1.24e⁴ 1.83e⁸     0     0  1138    NA  1.00  1.00  1.00  1.00  1.00     0     0  1.00    NA -2.00 - 1.40   49.0    0.400    5.00   0       6.00   0     2.54e⁷  16.1       0     0     0     0  157     16.0  
# ... with 18,018 more rows

Now let’s download the data only at annual frequency (similar to the size cuts from earlier):

as_tibble(dt_naics_year)
# A tibble: 4,507 x 38
   area_fips own_code industry_code agglvl_code size_code  year   qtr disclosure_code annual… annual… total_… taxab… annua… annua… avg_a… lq_di… lq_a… lq_a… lq_t… lq_t… lq_a… lq_a… lq_a… oty_… oty_… oty_a… oty_an… oty_ann… oty_to… oty_tot… oty_… oty_… oty_… oty_… oty_a… oty_a… oty_… oty_a…
   <chr>        <dbl>         <dbl>       <dbl>     <dbl> <dbl> <dbl> <chr>             <dbl>   <dbl>   <dbl>  <dbl>  <dbl>  <dbl>  <dbl>  <dbl> <dbl> <dbl> <dbl> <dbl> <dbl> <dbl> <dbl> <dbl> <dbl>  <dbl>   <dbl>    <dbl>   <dbl>    <dbl> <dbl> <dbl> <dbl> <dbl>  <dbl>  <dbl> <dbl>  <dbl>
 1 US000         1.00          1125        16.0         0  2000    NA ""                 2.00  8.00e⁰  2.32e⁵      0      0    589  30648     NA  1.00  1.00  1.00     0     0  1.00  1.00    NA  0      0     1.00e⁰   14.3    3.38e⁴   17.0       0     0     0     0   68.0  13.1   3567  13.2 
 2 US000         1.00          1131        16.0         0  2000    NA ""                78.0   4.731.94e⁸      0      0    788  40994     NA  1.00  1.00  1.00     0     0  1.00  1.00    NA  0      0    -1.56-  3.20   6.04e⁶    3.20      0     0     0     0   49.0   6.60  2547   6.60
 3 US000         1.00          2211        16.0         0  2000    NA ""               142     1.24e⁴  7.90e⁸      0      0   1222  63557     NA  1.00  1.00  1.00     0     0  1.00  1.00    NA -2.00 - 1.40 -5.10-  0.400  8.53e⁷   12.1       0     0     0     0  136    12.5   7096  12.6 
 4 US000         1.00          2213        16.0         0  2000    NA ""                10.0   9.372.67e⁷      0      0    547  28463     NA  1.00  1.00  1.00     0     0  1.00  1.00    NA  1.00  11.1   1.1413.9   -4.72e⁶ - 15.0       0     0     0     0 -186   -25.4  -9643 -25.3 
 5 US000         1.00          2362        16.0         0  2000    NA ""                 1.00  1.00e⁰  6.73e⁴      0      0   1294  67270     NA  1.00  1.00  1.00     0     0  1.00  1.00    NA  0      0     1.00e⁰  100      4.32e⁴  180         0     0     0     0  184    16.6   9555  16.6 
 6 US000         1.00          3231        16.0         0  2000    NA ""                 8.00  4.862.69e⁸      0      0   1063  55287     NA  1.00  1.00  1.00     0     0  1.00  1.00    NA  0      0    -6.10-  1.20   8.19e⁵    0.300     0     0     0     0   16.0   1.50   852   1.60
 7 US000         1.00          3259        16.0         0  2000    NA ""                 3.00  3.101.67e⁷      0      0   1037  53945     NA  1.00  1.00  1.00     0     0  1.00  1.00    NA  0      0    -2.00e⁰ -  0.600  1.30e⁶    8.40      0     0     0     0   84.0   8.80  4396   8.90
 8 US000         1.00          3321        16.0         0  2000    NA ""                 3.00  4.753.06e⁷      0      0   1241  64540     NA  1.00  1.00  1.00     0     0  1.00  1.00    NA  1.00  50.0   1.2636.1    9.43e⁶   44.5       0     0     0     0   72.0   6.20  3748   6.20
 9 US000         1.00          3329        16.0         0  2000    NA ""                10.0   1.23e⁴  6.24e⁸      0      0    976  50746     NA  1.00  1.00  1.00     0     0  1.00  1.00    NA  0      0    -1.32-  9.70   5.99e⁶    1.00      0     0     0     0  104    11.9   5379  11.9 
10 US000         1.00          3364        16.0         0  2000    NA ""                 1.00  2.606.86e⁷      0      0    508  26390     NA  1.00  1.00  1.00     0     0  1.00  1.00    NA  0      0    -4.50-  1.70  -1.19e⁷ - 14.8       0     0     0     0 - 77.0 -13.2  -4048 -13.3 
# ... with 4,497 more rows

Documentation

Cuts in QCEW

SIC Aggregation levels

Be careful: the SIC aggregation levels are different from the NAICS ones. See the reference table and the official BLS documentation for more details.

The division is as follows:

  • 01 to 11: National level; all sectors down to 4 digits; aggregated and by size classes
  • 12 to 17: MSA level; all sectors down to 4 digits; aggregated (no split by size classes available)
  • 18 to 25: State level; all sectors down to 4 digits; aggregated and by size classes (only some size splits are available)
  • 26 to 31: County level; all sectors down to 4 digits; aggregated (no split by size classes available)

What is available

The data file layout for a general view of what is available:

NAICS files

Some cuts are limited to an annual frequency (first quarter only), while others have a quarterly frequency (with monthly reports for employment).

  • Annual frequency cuts are:
    • National by size: 21, 22, 23, 24, 25, 26, 27, 28
    • State by size: 61, 62, 63, 64
  • Quarterly frequency: everything else
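The list above can be encoded as a small lookup, which is handy for checking a cut before requesting it. This is only a sketch: `naics_annual_cuts` and `naics_cut_frequency` are hypothetical names and are not part of entrydatar.

```r
# Cuts that the list above marks as annual-only (NAICS files)
naics_annual_cuts <- c(21:28,   # national, by size
                       61:64)   # state, by size

# Hypothetical helper: report the frequency at which a NAICS cut is available
naics_cut_frequency <- function(cut) {
  ifelse(as.integer(cut) %in% naics_annual_cuts, "annual", "quarterly")
}

# Cut 10 is quarterly; cuts 28 and 61 are annual-only
naics_cut_frequency(c(10, 28, 61))
```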

SIC files

Same as for NAICS. Collection stops in 2000. Size classes start only in 1997.

  • Annual frequency cuts are available only for size from 1997 to 2000:
    • National by size: 7, 8, 9, 10, 11
    • State by size: 24, 25
  • Quarterly frequency: everything else

Availability Warning

The BLS table can be misleading at times, and not all files are created equal. I am working on harmonizing all of the data pulls, but there seem to be a lot of exceptions. Here are the warnings I found important:

  1. There is no disaggregated data for size X industry before 1990 in the case of NAICS. The layout of files on the download page is somewhat misleading.
  2. For the SIC by industry files, the data from standard sources only goes back to 1984. I have yet to implement the function that also downloads and reads the data from 1975 to 1983. They are arranged in individual industry files.
    • For the year 1983 the available cuts are: 01, 02, 03, 04, 05, 06, 13, 14, 15, 18, 19, 20, 21, 22, 23, 26, 27, 28, 29
    • For the year 1984 the available cuts are: 01, 02, 03, 04, 05, 06, 13, 14, 15, 16, 17, 18, 19, 20, 21, 22, 23, 26, 27, 28, 29, 30, 31
    • The missing cuts pre-1984 are: 16, 17 (MSA 3 and 4 digits SIC) and 30, 31 (County 3 and 4 digits SIC)

General documentation

The table of contents to download datasets directly:

The data file layout for a general view of what is available:

Data codes

Aggregation levels

  • Aggregation levels and files that contain them are defined by the BLS; we reproduce the table in the package for merging or easier access:

Size Classes

Industries

Industry titles are standard in that case

Ownership

Ownership codes go from 0 to 6:

  • 0 represents the aggregate or Total Covered
  • 5 represents the Private sector
  • 1 to 4 represent different levels of government: 4 for International Government; 3 for Local Government; 2 for State Government and 1 for Federal Government
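As an illustration, the codes listed above can be stored in a small lookup table and merged onto downloaded data. This is a sketch in base R: `own_titles` and `dt_toy` are hypothetical names, and the package may ship its own version of this table.

```r
# Lookup table built from the codes listed above
own_titles <- data.frame(
  own_code  = 0:5,
  own_title = c("Total Covered", "Federal Government", "State Government",
                "Local Government", "International Government", "Private"),
  stringsAsFactors = FALSE
)

# Toy rows standing in for downloaded QCEW data
dt_toy <- data.frame(area_fips = "US000", own_code = c(0L, 5L, 1L),
                     stringsAsFactors = FALSE)

# Left join on own_code: every data row is kept, and rows come back sorted by own_code
dt_labeled <- merge(dt_toy, own_titles, by = "own_code", all.x = TRUE)
dt_labeled
```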

Online docs for NAICS and SIC

Area codes and titles (FIPS)

Area codes are structured somewhat like industry codes and have 5 characters. US000 is the aggregate over the whole US. Other codes, of the form XXYYY, can be split into two parts:

  • XX represents the state as in Census codes (alphabetical order): 01 is Alabama and 02 is Alaska.
  • YYY represents the county within the given state
  • There are exceptions to YYY matching to counties:
    • If YYY is 000, then the code represents data aggregated at the state level, e.g. 01000 for Alabama
    • If YYY is 996, then it represents “Overseas Locations”
    • If YYY is 997, then it represents “Multicounty, Not Statewide”
    • If YYY is 998, then it represents “Out-of-State”
    • If YYY is 999, then it represents “Unknown Or Undefined”
  • If the first character is C, then the code represents a subdivision at the MSA level
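The rules above translate into a short classifier. This is a sketch in base R: `classify_area_fips` is a hypothetical helper, not a function of the package, and the MSA code in the example is for illustration only.

```r
# Hypothetical helper: split a 5-character QCEW area code and classify it
classify_area_fips <- function(area_fips) {
  is_us  <- area_fips == "US000"
  is_msa <- substr(area_fips, 1, 1) == "C"

  # State and county parts only make sense for XXYYY codes
  state  <- ifelse(is_us | is_msa, NA_character_, substr(area_fips, 1, 2))
  county <- ifelse(is_us | is_msa, NA_character_, substr(area_fips, 3, 5))

  # County suffixes that do not correspond to an actual county
  special <- c("996" = "Overseas Locations",
               "997" = "Multicounty, Not Statewide",
               "998" = "Out-of-State",
               "999" = "Unknown Or Undefined")

  kind <- ifelse(is_us, "national",
          ifelse(is_msa, "MSA",
          ifelse(county == "000", "state",
          ifelse(county %in% names(special), unname(special[county]), "county"))))

  data.frame(area_fips, state, county, kind, stringsAsFactors = FALSE)
}

area_examples <- classify_area_fips(c("US000", "01000", "01001", "02997", "C1018"))
area_examples
```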

BLS documentation is available at the following links:

1: The Stata version of this code is on Gabe’s website here
  1. Erik Loualiche