Now in their 30th year, the U.S. Sentencing Guidelines (Guidelines) have been used to sentence well over 1.5 million defendants nationwide since Nov. 1, 1987, when they first went into effect. (See U.S. Sentencing Comm’n, 1996-2015 Sourcebooks on Federal Sentencing Statistics, Tbl. 10; U.S. Sentencing Comm’n, Quarterly Data Report (4th Quarter Release), Tbl. 1 (Sept. 30, 2016). From these sources, there were approximately 1.4 million individuals sentenced under the Guidelines.)
Eliminating unwarranted sentencing disparity was the primary goal of the Sentencing Reform Act. (See U.S. Sentencing Comm’n, Fifteen Years of Guidelines Sentencing, 79 (2004)). The act created the U.S. Sentencing Commission (Commission), tasked it with creating the Guidelines, and gave it the authority to amend them and to promulgate new Guidelines from time to time. (See 28 U.S.C. § 994).
Since their inception, the Guidelines have been amended hundreds of times. This process largely has been informed by the data the Commission collects, publishes and analyzes regarding the application of the Guidelines, including sentences imposed and departures or variances, although in many instances the Commission has been directed by Congress to make certain changes. In short, the Guidelines have evolved primarily, although not exclusively, as a result of the Commission’s ‘‘empirical approach’’ to sentencing. (See USSG, Ch. 1, Pt. A.)
The purpose of this article is to provide the reader with an overview of what data are available, and to provide suggestions as to how the data may most effectively be used by practitioners in mitigation of punishment.
Breaking Bad . . . Bittersweetly
Jesse, a charismatic entrepreneur, is interested in operating a rock candy manufacturing and distribution company. However, he is an awful businessman, having failed in several ventures previously. Further, he has no capital and is unable to obtain any commercial loans. Jesse is introduced to Walter, an upstanding CPA in the community with an excellent reputation, and most importantly, cash, which Walter has accumulated over the course of his successful career. Jesse keeps secret from Walter his past business failures.
Jesse convinces Walter to become a partner with him in the venture, which they call Baby Blue Sweets. Walter invests a few hundred thousand, and, through his connections in the community, secures a significant line of credit for the new business from a local community bank. The line of credit is intended for the purchase of equipment needed to operate the business.
Unfortunately, Jesse, unbeknown to Walter, uses up a significant portion of the line on personal debts and other ‘‘expenditures.’’ When it comes time to purchase the equipment, there is not enough in the line of credit. Further, Walter discovers that the company Baby Blue is ordering the equipment from is actually controlled by Jesse. Walter confronts Jesse, but Jesse, always the charismatic entrepreneur, calms Walter, and even convinces him to ‘‘fib a little’’ to the bank so as to get Baby Blue’s line of credit increased. Against his better judgment, but convinced all will work out, Walter complies by submitting fabricated documents to the bank and making other material misrepresentations and omissions, including omitting the fact that Jesse controls the company that is selling the equipment to Baby Blue. The bank is fooled and dutifully increases the line.
Walter and Jesse order the equipment, which, of course, is coming from the company Jesse also controls. Not surprisingly, the money paid for the equipment, which ends up with Jesse, disappears and the equipment never arrives. Baby Blue quickly goes belly-up. Walter is out his investment, of course, but the bank is out nearly $2 million and quickly calls the FBI. Walter and Jesse are investigated and charged with conspiracy to commit bank fraud and wire fraud. Jesse quickly decides to cooperate against Walter and works a deal where he will not have to do any time. Now Walter must decide what to do.
A Walk-Through of Walter’s Decision-Making Process Based on Data
So, what is Walter’s exposure under the Guidelines (as opposed to the theoretical statutory maximum)? Any sentencing will be pursuant to the so-called Fraud Guidelines at USSG § 2B1.1. As the counts charged each carry at least a 20-year statutory maximum penalty, the base offense level will be seven. (See USSG § 2B1.1(a)(1)). With a $2 million loss, his offense level will be increased by 16 levels. (See USSG § 2B1.1(b)(1)(I)). Finally, he certainly will be assessed an additional two-level adjustment for sophisticated means. (See USSG § 2B1.1(b)(10)(C)). Therefore, his total adjusted offense level is 25. Walter has absolutely no criminal history, so his Criminal History Category is I. (See USSG § 4A1.1). So, if Walter goes to trial and is convicted, his advisory sentencing range will be 57 to 71 months. However, if he pleads guilty, he will receive a three-level downward adjustment for acceptance of responsibility, which will reduce his final offense level to 22 for an advisory sentencing range of 41 to 51 months. (See USSG § 3E1.1). So, on its face, by pleading guilty, Walter could lower his advisory sentencing range by 16 to 20 months.
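The arithmetic above can be laid out as a short Python sketch. The adjustment values and the two Criminal History Category I ranges are taken directly from the walk-through; the constant and function names are illustrative assumptions, and the lookup table holds only the two ranges quoted in this article rather than the full Sentencing Table.

```python
# Sketch of Walter's offense-level arithmetic under USSG § 2B1.1.
# All names are illustrative; the numbers come from the walk-through above.
BASE_OFFENSE_LEVEL = 7      # counts carrying a 20-year statutory maximum
LOSS_INCREASE = 16          # loss of roughly $2 million
SOPHISTICATED_MEANS = 2     # sophisticated-means adjustment
ACCEPTANCE_REDUCTION = 3    # acceptance of responsibility (guilty plea)

# Partial lookup: only the two CHC I ranges quoted in the article, in months.
CHC_I_RANGES = {25: (57, 71), 22: (41, 51)}


def advisory_range(pleads_guilty):
    """Return (final offense level, (low, high) advisory range in months)."""
    level = BASE_OFFENSE_LEVEL + LOSS_INCREASE + SOPHISTICATED_MEANS
    if pleads_guilty:
        level -= ACCEPTANCE_REDUCTION
    return level, CHC_I_RANGES[level]


trial_level, trial_range = advisory_range(pleads_guilty=False)
plea_level, plea_range = advisory_range(pleads_guilty=True)

# Reduction at the bottom and top of the range that results from pleading.
reduction = (trial_range[0] - plea_range[0], trial_range[1] - plea_range[1])

print(trial_level, trial_range)  # 25 (57, 71)
print(plea_level, plea_range)    # 22 (41, 51)
print(reduction)                 # (16, 20)
```

Any real calculation would, of course, need the complete Sentencing Table and the specific offense characteristics that apply to the case at hand.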
But how likely is he to receive a sentence within either sentencing range? Below is a chart created using the Commission’s online Interactive Sourcebook, which demonstrates the overall compliance rate for USSG § 2B1.1. (Available at http://isb.ussc.gov/Login). As clearly seen, the within-Guideline compliance rate has dropped from 70 percent in 2006 to just over 40 percent in 2015. During that same period of time, the rate at which judges have sentenced below the advisory range, i.e., downward variances, has grown from about 12 percent to just over 30 percent. Likewise, downward departures pursuant to Government motions also have grown from about 12 percent to nearly 25 percent. Upward departures and variances have remained negligible.
Thus, Walter has a fairly good chance of receiving a sentence below the Guidelines range even if he is unable to offer any substantial assistance. But this does not answer what sentence Walter can expect. Below is a table of average (mean) and median sentences imposed in 2015 by specified Guidelines. For those sentenced under USSG § 2B1.1, the average sentence was 24 months and the median was 12 months.
Focusing on those individuals most similarly situated to Walter, i.e., sentenced between 2006 and 2015 under USSG § 2B1.1, in CHC I, with a final offense level of 25 (if going to trial) or 22 (if otherwise pleading guilty), and subject to no mandatory minimum penalty, reveals the following, filtered by those that went to trial and those that pleaded guilty. Walter now has a better, more informed look at what he might expect should he go to trial or plead guilty. If he goes to trial, the average sentence of 44.3 months covering the period 2006-2015 is nearly 13 months below the minimum of the sentencing range, and the average sentence of 37.4 months for just 2015 is nearly 20 months below the bottom of the range. If he pleads guilty, the average sentence of 29.2 months is nearly 12 months below the minimum, and the average of 26.3 months for just 2015 is nearly 15 months below the bottom of the range.
Now, the government approaches Walter with an offer of 30 months if he decides to plead. He must agree to that sentence and will not be able to argue for anything less. Is that a good offer? As the above table notes, a 30-month sentence is slightly above the average sentence of 29.2 months for those who plead guilty but is below the median of 33 months. In looking at the data, a 30-month sentence is only in the 43.1st percentile, meaning it is greater than only 43.1 percent of all those in this class of offenders, but less than 56.9 percent of all such sentences. So, should he plead guilty pursuant to the government’s offer, he can roughly expect a typical result.
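The percentile figure can be reproduced from case-level data with a simple calculation: the share of sentences in the comparison class that are shorter than the offer. The sketch below assumes that definition; the function name is hypothetical, and the sample list is invented placeholder data for illustration only, not Commission data.

```python
def percentile_below(sentences, offer):
    """Percent of sentences (in months) strictly shorter than the offer."""
    if not sentences:
        raise ValueError("need at least one sentence to compare against")
    shorter = sum(1 for months in sentences if months < offer)
    return 100.0 * shorter / len(sentences)


# Invented placeholder sentences in months; NOT actual Commission data.
sample_sentences = [0, 12, 18, 24, 30, 33, 36, 41, 46, 51]

print(percentile_below(sample_sentences, 30))  # 40.0
```

Run against the actual sentences for the relevant class of offenders, the same calculation would yield the kind of percentile figure discussed above.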
But Walter knows he almost certainly will receive a variance, so he now wants to focus on those who received variances: what was the typical variance, and what sentences did similarly situated offenders in that class receive? As the table below shows, at the end of the day, going to trial on average yields a sentence of only 30.6 months for those who receive downward variances (which is most of the defendants in this class, i.e., 53.4 percent), whereas, if he pleads and receives a typical downward variance, he can expect to receive a sentence of 23.1 months. However, only 28.6 percent of those who plead guilty receive a downward variance that is not sponsored by the government. Thus, to expect a result like this, he would have to plead straight up to the indictment, i.e., without benefit of a plea agreement, since the agreement binds him (and the court) to 30 months (which already is a downward variance from the Guidelines range).
Thus, in drilling down into the data, Walter now can make a far more informed decision about whether to go to trial versus plead guilty pursuant to the plea agreement, or even to plead straight up. Again, if Walter goes to trial and is convicted, his sentencing range would be 57 to 71 months, but, statistically speaking, it is more likely that his sentence will be in the neighborhood of 30.6 months, or nearly half the bottom of the range given both the likelihood of a downward variance and the degree of such a variance. In contrast, if Walter pleads guilty pursuant to the plea agreement, which requires a 30-month sentence, he can expect the same result as if he were to go to trial and be convicted. But, by taking the plea agreement, he arguably is in a worse position than going to trial, for he will have waived his right to appeal and collaterally attack his conviction and sentence. So, that leaves pleading open or straight-up. Again, by pleading guilty his sentencing range will be 41 to 51 months, and those similarly situated to him are receiving on average a sentence of nearly half the bottom of the range at 23.1 months. Thus, Walter must decide, ultimately (and strictly statistically speaking), how much is 7.5 months’ worth? Put differently, statistically speaking, by going to trial he risks 7.5 months more time than if he were to plead guilty without an agreement.
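The comparison above reduces to a few lines of arithmetic. The sketch below uses only the averages and range minimums quoted in this article; the variable and function names are illustrative assumptions, and the output describes averages, not a prediction for any individual case.

```python
# Average sentences (months) for similarly situated defendants who received
# downward variances, and the bottom of each advisory range, as quoted above.
TRIAL = {"range_bottom": 57, "avg_with_variance": 30.6}
OPEN_PLEA = {"range_bottom": 41, "avg_with_variance": 23.1}
PLEA_AGREEMENT_SENTENCE = 30  # the government's binding offer, in months


def share_of_range_bottom(option):
    """Average sentence as a percentage of the bottom of the advisory range."""
    return round(100 * option["avg_with_variance"] / option["range_bottom"], 1)


# Statistical "cost" of trial relative to an open plea, in months.
trial_premium = round(
    TRIAL["avg_with_variance"] - OPEN_PLEA["avg_with_variance"], 1
)

# Gap between the average trial outcome and the plea agreement, in months.
agreement_gap = round(TRIAL["avg_with_variance"] - PLEA_AGREEMENT_SENTENCE, 1)

print(trial_premium)                     # 7.5
print(agreement_gap)                     # 0.6
print(share_of_range_bottom(TRIAL))      # 53.7
print(share_of_range_bottom(OPEN_PLEA))  # 56.3
```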
Walter decides to roll the dice and go to trial, for he figures the government isn’t really offering him anything he otherwise would not get if he goes to trial and is convicted. And by pleading open, he would forfeit the chance at an acquittal as well as the chance to litigate any trial errors, should he be convicted on any counts. Pursuant to his plea deal with the government, Jesse testifies against Walter at his trial. And, not too surprisingly, Walter is convicted on all counts. As Jesse’s testimony substantially assisted the government in prosecuting Walter, it dutifully moves for a downward departure pursuant to USSG § 5K1.1 and, despite the fact the Guidelines sentencing range for Jesse is 41 to 51 months, recommends one day of probation, three years’ supervised release, and restitution in the amount of $2 million. The district court grants the motion and sentences Jesse consistent with the government’s recommendation. Walter, luckily, has a very sympathetic probation officer. Although the officer calculates Walter’s sentencing range to be 57 to 71 months (as expected), he nevertheless recommends a below-Guidelines sentence of only 15 months given Walter’s lack of any criminal history and less culpable role than Jesse. The government, too, is surprisingly sympathetic, but not as much. For its part, it recommends the same sentence that Walter would have received had he taken the plea agreement: 30 months. The figure below provides a comparison of these sentencing recommendations to select offense categories all in Criminal History Category I.
As is apparent, the bottom of the Guidelines range is most likely far too severe, and certainly not what the typical offender similarly situated to Walter receives. Indeed, it is greater than 57.6 percent of all similarly situated offenders. In fact, it is greater than even the average sentence imposed on drug-trafficking offenders generally, including those subject to mandatory minimum penalties and higher offense levels. While the government recommendation certainly is helpful and perhaps even generous given it is approximately the same as what Walter would have received had he pleaded guilty pursuant to an agreement, it still is higher than the average for assault cases generally, which includes those with higher offense levels. In this context, therefore, the PSR’s recommendation certainly appears quite lenient but still relatively serious. It is lenient relative to similarly situated offenders in that it is in the 11.5th percentile (meaning it is greater than only 11.5 percent of similarly situated offenders), but is still relatively serious in that it is only a little less than what the average fraud offender receives, including those with higher offense levels.
But Walter does not feel he should do any time, especially where the most culpable person was Jesse, and he received no time. Are there any individuals similarly situated to him that received probationary sentences? Indeed, there are. Of the 131 individuals sentenced under USSG § 2B1.1 between 2006 and 2015, who had a Final Offense Level of 25, were in CHC I and were not subject to any mandatory penalty, there were 15 individuals who received a year-and-a-day or less, of which one defendant received exactly one day, and six received a sentence of no prison whatsoever. In fact, nearly 30 percent of those sentenced for fraud offenses in 2015 received a non-custodial sentence or at least a split sentence. (See U.S. Sentencing Comm’n, 2015 Sourcebook on Federal Sentencing Statistics, Tbl. 12 (‘‘Offenders Receiving Sentencing Options in Each Primary Offense Category’’), available at http://www.ussc.gov/sites/default/files/pdf/research-and-publications/annual-reports-and-sourcebooks/2015/Table12.pdf.) Thus, while non-custodial sentences in fraud cases are unusual, they certainly are not unprecedented or even rare.
As it turns out, the sentencing judge agrees with Walter not only that the Guidelines are calibrated far too high and that a variance is warranted, but also that any imprisonment is unwarranted. So, the judge imposes a sentence of one day imprisonment followed by three years’ supervised release and an order of full restitution to be paid jointly and severally with Jesse. And the government appeals, claiming that such a significant downward variance is substantively unreasonable.
Stats in Real Life
Guess what? The above hypothetical is based on an actual case: United States v. Musgrave, 761 F.3d 602 (6th Cir. 2014) (Musgrave I). In Musgrave I, the defendant initially received a sentence of one day imprisonment (with credit for the day of processing), three years’ supervised release (without home confinement), and no fine. The government, not surprisingly, appealed. In reviewing the sentence for substantive reasonableness (i.e., for an abuse of discretion), the Sixth Circuit noted that ‘‘[a] sentence may be considered substantively unreasonable when the district court selects a sentence arbitrarily, bases the sentence on impermissible factors, fails to consider relevant sentencing factors, or gives an unreasonable amount of weight to any pertinent factor.’’ (Musgrave I, 761 F.3d at 608 (emphasis added; quoting United States v. Conatser, 514 F.3d 508, 520 (6th Cir. 2008))). In Musgrave I, the district court had cited the collateral consequences of Musgrave’s conviction as partial justification for the significant downward variance. (Musgrave I, 761 F.3d at 608). ‘‘In imposing a sentence of one day with credit for the day of processing, the district court relied heavily on the fact that Musgrave had already ‘been punished extraordinarily’ by four years of legal proceedings, legal fees, the likely loss of his CPA license, and felony convictions that would follow him for the rest of his life.’’ (Id.) Accordingly, the Sixth Circuit remanded for resentencing inasmuch as reliance on collateral consequences was impermissible. (See id.) But in doing so, the Sixth Circuit was careful to state: ‘‘it bears repeating that ‘[w]hile appellate courts retain responsibility for identifying proper and improper sentencing considerations after Booker, it is not our task to impose sentences in the first instance or to second guess the individualized sentencing discretion of the district court when it appropriately relies on the § 3553(a) factors.’ ’’ (Id., quoting United States v. Davis, 537 F.3d 611, 618 (6th Cir. 2008)).
On remand, the district court once again imposed a sentence of one day, but this time increased the term of supervised release from three years and no home confinement to five years’ supervised release with a condition of 24 months’ home confinement. The district court also now imposed a $250,000 fine where it previously had not imposed any. Nonetheless, focused on the non-custodial aspect of the sentence, the government again appealed. (See United States v. Musgrave, No. 15-3043, 1 (6th Cir. May 4, 2016) (unpublished) (Musgrave II)). This time, however, the Sixth Circuit affirmed the sentence. In so affirming, the Sixth Circuit observed the following:
Based on the district court’s review of statistics and other cases, of all white-collar defendants in our circuit, nearly 30% receive no prison time, and approximately one-third of that 30% receive some form of home confinement instead. The government asserts that the district court should have limited its review to cases involving losses between $1 million and $2.5 million, where ‘‘nearly 90% of defendants were sentenced to an average of 40 months in prison.’’ But there is reason to believe that, because the loss Guidelines were not developed using an empirical approach based on data about past sentencing practices, it is particularly appropriate for variances.
(Musgrave II, No. 15-3043, at 15 (emphasis added; citing, inter alia, United States v. Corsey, 723 F.3d 366, 379 (2d Cir. 2013) (Underhill, J., concurring); Mark H. Allenbaugh, ‘‘Drawn from Nowhere’’: A Review of the U.S. Sentencing Commission’s White-Collar Sentencing Guidelines and Loss Data, 26 Fed. Sent’g Rep. 19, 19 (2013))). The Sixth Circuit also helpfully observed that ‘‘[a] sentence does not result in unwarranted disparities simply because it deviates from the average.’’ (Musgrave II, No. 15-3043, at 16).
The Baby Blue hypothetical provides the reader with a walk-through of how to effectively use sentencing statistics from the plea/trial stage through sentencing, and even on appeal. The Musgrave I & II cases illustrate how such statistics can be helpful in achieving even an unusual result. The use of such data and analyses thus is fundamentally important in the era of advisory Guidelines, especially where courts increasingly are varying from certain Guidelines at ever higher degrees of magnitude. The primary purpose of the Guidelines, after all, is to avoid unwarranted sentencing disparity. Thus, as the Guidelines are modeled after statistics, the use of statistics manifestly is necessary in order to avoid such disparities in advocating for the lowest sentence possible. But doing so will require a deep dig into the data.
As this article was going to press, the U.S. Court of Appeals for the Second Circuit issued a remarkable published opinion emphasizing the need for consulting and using statistics at sentencing. In United States v. Jenkins, No. 14-4295 (2d Cir. Apr. 17, 2017) (2-1), the Second Circuit reversed a within-Guidelines sentence as substantively unreasonable where the district court neglected to consult readily available sentencing statistics from the U.S. Sentencing Commission. Specifically, the question was whether a within-Guidelines sentence of 225 months in a child pornography possession case was substantively unreasonable. In reversing, the Second Circuit first noted the ‘‘irrationality in 2G2.2’’ by citing Commission statistics demonstrating that nearly all those sentenced under that Guideline received close to the statutory maximum penalty. (Id. at *12). The Second Circuit then held that ‘‘[t]he sentence the district court imposed also created the type of unwarranted sentence disparity that violates § 3553(a)(6). Statistics from the Sentencing Commission validate our concern. . . . [T]he Commission’s statistics, which were readily available to the district court at the time of sentencing, allow for a meaningful comparison of Jenkins’s behavior to that of other child pornography offenders,’’ which plainly showed that the sentence was excessive and unwarrantedly disparate from other similarly situated offenders. (Id. at *19) (emphasis added; footnote omitted). The Jenkins holding naturally also suggests it would constitute ineffective assistance of counsel to neglect to consult and cite Commission statistics and data in appropriate cases.