Model estimates that COVID-19 cases in the U.S. are nearly three times higher than reported

Global health experts have long suspected that the incidence of COVID-19 has been higher than reported. Now, a machine learning algorithm developed at UT Southwestern estimates that the number of COVID-19 cases in the U.S. since the outbreak began is nearly three times the number of confirmed cases.

The algorithm, described in a study published today in PLOS ONE, provides daily updated estimates of total infections to date as well as how many people are currently infected across the U.S. and in the 50 countries hardest hit by the pandemic.

As of Feb. 4, according to the model’s calculations, more than 71 million people in the U.S. – 21.5 percent of Americans – had contracted COVID-19. That compares with the 26.7 million publicly reported confirmed cases, said Jungsik Noh, Ph.D., a UT Southwestern assistant professor in the Lyda Hill Department of Bioinformatics and first author of the study.

Of those 71 million Americans estimated to have had COVID-19, 7 million (2.1 percent of the U.S. population) had active infections and could have been infectious on Feb. 4, according to the algorithm.

Noh’s published study is based on calculations completed in September. At that time, he reports, the number of cumulative cases in 25 of the 50 hardest-hit countries was five to 20 times greater than the number of confirmed cases reported at the time.

According to the current figures from the online algorithm, the estimates are now closer to the reported numbers – but still much higher. On Feb. 4, Brazil had more than 36 million cumulative cases as estimated by the algorithm, almost four times the reported 9.4 million confirmed cases. France had 14 million compared with the reported 3.2 million. And the UK had almost 25 million instead of about 4 million – more than six times as many. Mexico, an outlier, had nearly 15 times the number of reported cases – 27.6 million instead of 1.9 million confirmed cases.
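The multiples quoted above come from dividing each estimate by the corresponding confirmed count. As a minimal sketch of that arithmetic, the snippet below uses only the rounded Feb. 4 figures given in this article. The dictionary and function names are illustrative and are not taken from the study's code.

```python
# Rounded Feb. 4 figures quoted in the article, in millions of cases:
# (algorithm estimate, confirmed cases). Several are approximate
# ("more than 71 million", "about 4 million"), so the results are rough.
FIGURES_MILLIONS = {
    "U.S.": (71.0, 26.7),
    "Brazil": (36.0, 9.4),
    "France": (14.0, 3.2),
    "UK": (25.0, 4.0),
    "Mexico": (27.6, 1.9),
}


def underreporting_multiple(estimated, confirmed):
    """Return how many times larger the estimate is than the confirmed count."""
    if confirmed <= 0:
        raise ValueError("confirmed count must be positive")
    return estimated / confirmed


multiples = {
    country: underreporting_multiple(estimated, confirmed)
    for country, (estimated, confirmed) in FIGURES_MILLIONS.items()
}

for country, multiple in multiples.items():
    print(f"{country}: about {multiple:.1f} times the confirmed count")
```

Because the inputs are rounded, the printed multiples only approximate the ones the model itself would report.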

“The estimates of actual infections reveal for the first time the true severity of COVID-19 across the U.S. and in countries around the world,” Noh said.

The algorithm uses the number of reported deaths – which is thought to be more accurate and complete than the number of laboratory-confirmed cases – as the basis for its calculations. It then assumes an infection fatality rate of 0.66 percent, based on an earlier study of the pandemic in China, and considers other factors such as the average number of days from the onset of symptoms to death or recovery. It also compares its estimate with the number of confirmed cases to work out the ratio of confirmed to estimated infections.
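The basic idea of working backward from deaths can be illustrated with a back-of-the-envelope calculation. The sketch below is not Noh's machine learning algorithm. It shows only the core relationship, assuming a fixed 0.66 percent fatality rate and ignoring the delay between symptom onset and death or recovery that the actual model takes into account. The function names are hypothetical, and the death count in the example is a made-up placeholder, not real data. The ratio uses the U.S. figures quoted in this article (26.7 million confirmed, 71 million estimated).

```python
FATALITY_RATE = 0.0066  # 0.66 percent, the rate the article says the model assumes


def estimate_cumulative_infections(cumulative_deaths, fatality_rate=FATALITY_RATE):
    """Rough number of infections that would produce this many deaths.

    Simplification: no time lag and a single fixed fatality rate.
    """
    if not 0 < fatality_rate <= 1:
        raise ValueError("fatality_rate must be in (0, 1]")
    if cumulative_deaths < 0:
        raise ValueError("cumulative_deaths must be non-negative")
    return cumulative_deaths / fatality_rate


def confirmed_to_estimated_ratio(confirmed_cases, estimated_infections):
    """Share of estimated infections that appear in the confirmed counts."""
    if estimated_infections <= 0:
        raise ValueError("estimated_infections must be positive")
    return confirmed_cases / estimated_infections


# Hypothetical placeholder input, chosen only to make the arithmetic easy to follow.
hypothetical_deaths = 66_000
estimated = estimate_cumulative_infections(hypothetical_deaths)
print(f"Estimated infections: {estimated:,.0f}")

# U.S. figures for Feb. 4 as quoted in the article.
ratio = confirmed_to_estimated_ratio(26.7e6, 71e6)
print(f"Confirmed-to-estimated ratio: {ratio:.2f}")
```

A lower assumed fatality rate would push the estimate of infections higher, which is one reason the article describes the estimates as rough.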

Much is still unknown about COVID-19 – especially the fatality rate – so the estimates are rough, Noh says. But he believes the model’s estimates are more accurate, and miss fewer cases, than the confirmed counts currently used to guide public health policies. It is important to get a more complete estimate of the prevalence of the disease, Noh adds.

“These are critical statistics about the severity of COVID-19 in each region. Knowing the true severity in different regions will help us effectively fight the spread of the virus,” he explains. “The currently infected population is the cause of future infections and deaths. Its actual size in a region is a crucial variable when determining the severity of COVID-19 and building strategies to combat regional outbreaks.”

In the U.S., infection rates vary widely by state. California has had nearly 7 million infections since the onset of the pandemic compared with 5.7 million in New York, according to the algorithm’s estimates for Feb. 4. The model also estimated that California had 1.3 million active cases on that date, affecting 3.4 percent of the state’s population.

Other model estimates for Feb. 4: In Pennsylvania, 11.2 percent of the population had active infections – the highest rate of any state, compared with a low of 0.15 percent of those living in Minnesota; in New York, an early hot spot, 528,000 people, or about 2.7 percent of the population, had active infections. Meanwhile, in Texas, 2.3 percent had active infections.

Noh says he developed the algorithm last summer while trying to decide whether to send his sixth-grade daughter back to school in person. There was nowhere to find the data he needed to judge whether it was safe, he says.

Once he built the machine learning algorithm, he found that the active infection rate in the area where he lived was about 1 percent. So his daughter went to school.

Noh validated his findings by comparing his results with the prevalence rates found in several studies that used blood tests to check for antibodies to SARS-CoV-2, the virus that causes COVID-19. For most of the areas tested, his algorithm’s estimates of infections corresponded closely to the percentage of people who tested positive for the antibodies, according to the PLOS ONE study.

The online model uses COVID-19 death data from Johns Hopkins University and the COVID Tracking Project, a volunteer group set up to help track COVID-19, to perform its daily updates. However, the estimates published in the PLOS ONE study date from Sept. 3. At that time, approximately 10 percent of the U.S. population had been infected with COVID-19, according to Noh’s algorithm.

Source:

UT Southwestern Medical Center
