Viral statistics

The media are full of stories about the new virus, but the way the data on the disease are presented has never struck me as giving a clear message. The numbers of cases or deaths per day do not mean much on their own (except, of course, to the families of the bereaved); as sheer numbers, they don't really indicate whether things are getting better or worse. I have recently been discussing this with new member Paul Altobell, who has worked professionally in the field of statistics, and I'm glad to say that he agrees with me (or vice versa). I have always felt that plotting the frequency coordinates in logarithmic form would give a much clearer idea.

However, today's Times has switched to doing just that, although the fact that it is a logarithmic scale is mentioned at the bottom of the diagram in vanishingly small print. The Financial Times is acknowledged as the data source. The other criticism I should make is that the frequency data are presented as absolute numbers, whereas it's clear that they have been normalised by dividing the numbers by the head of population (in millions? It doesn't say).

When you do this and compare the plots for different countries, it's amazing how all the European and US data fall very closely on the same curve, at least initially. On the log plot the curve is more nearly linear, which enables one to predict by extrapolation how many deaths will result if nothing is done. We are still waiting to see if the lockdown here is having the desired effect. The Italian curve has definitely started to level out. It means of course that there will be many more deaths, but at least the rate is falling off to one where the hospitals can cope. Even Germany seems to follow the same curve, but they are some days behind us.
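
As a rough illustration of the two steps described here (dividing by population in millions, then taking logs), here is a minimal Python sketch. The numbers are made up, not real data: the population figure, the starting count, the 3-day doubling and the `per_million` helper are all assumptions chosen only to show the effect.

```python
import math

def per_million(counts, population):
    """Normalise raw counts by population, expressed in millions."""
    return [c / (population / 1_000_000) for c in counts]

# Hypothetical series: deaths doubling every 3 days, starting from 10.
days = list(range(0, 13, 3))                 # 0, 3, 6, 9, 12
deaths = [10 * 2 ** (d / 3) for d in days]   # 10, 20, 40, 80, 160
population = 60_000_000                      # assumed, not a real figure

rate = per_million(deaths, population)
log_rate = [math.log10(r) for r in rate]

# On a log scale, equal time steps give equal vertical steps: a straight line.
steps = [b - a for a, b in zip(log_rate, log_rate[1:])]
print([round(s, 4) for s in steps])
```

Every step comes out the same (log10 of 2, about 0.301), which is why steady doubling appears as a straight line on a log plot. Dividing by population only shifts the line up or down; it does not change its gradient.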

The one outlier is of course Korea, where their experience of SARS has enabled them to have the resources and techniques ready to throttle the progress of the disease from the start.

I finish by including a couple of curves that Paul has put together, which illustrate more clearly what I have been saying.

Re: Viral statistics

Reply #2
I remember a few years ago discovering that a normal distribution looked normal and worked when there was only one variable. However, if taking logs normalised the data, it meant there were two (or more) underlying causes that multiplied together.

So, if taking logs normalises the Covid-19 data, what are the two variables?
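
The general idea can be checked with a small simulation. This is only a toy sketch: the choice of 10 factors, the range 0.8 to 1.25 and the `skewness` helper are assumptions for illustration, and nothing here says the Covid-19 figures actually arise this way.

```python
import math
import random

def skewness(values):
    """Sample skewness: near 0 for a symmetric, bell-shaped spread."""
    n = len(values)
    mean = sum(values) / n
    var = sum((v - mean) ** 2 for v in values) / n
    third = sum((v - mean) ** 3 for v in values) / n
    return third / var ** 1.5

rng = random.Random(0)   # fixed seed so the run is repeatable

# Each sample is the product of 10 independent factors (an assumed toy model).
products = [
    math.prod(rng.uniform(0.8, 1.25) for _ in range(10))
    for _ in range(20_000)
]
logs = [math.log(p) for p in products]

print("skewness of raw products:", round(skewness(products), 2))
print("skewness of their logs:  ", round(skewness(logs), 2))
```

The raw products have a long right-hand tail, while their logs are close to symmetric. Taking logs turns a product into a sum, and sums of many independent terms tend towards the normal shape.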

Re: Viral statistics

Reply #3
I don't quite follow the point about two variables being present. In this case we are not so much dealing with a statistical distribution as with a plot of a time series. A linear plot would be perfectly justified here, but if we are interested in the rate of change (whether the rate is increasing or decreasing, for instance), a log scale on the y-axis gives a much clearer picture. Extrapolation is also easier (just extend the straight line into the future) but, as with any attempt to predict the future, it should be done with great care.
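
Extending the straight line can be sketched as a least-squares fit to the logged counts. The counts below are hypothetical, and `fit_line` and `extrapolate` are illustrative helpers rather than anyone's published method.

```python
import math

def fit_line(xs, ys):
    """Ordinary least-squares fit of y = intercept + slope * x."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    slope = sxy / sxx
    return mean_y - slope * mean_x, slope

# Hypothetical daily cumulative counts, not real data.
days = [0, 1, 2, 3, 4, 5]
counts = [100, 126, 159, 200, 252, 317]

intercept, slope = fit_line(days, [math.log(c) for c in counts])

def extrapolate(day):
    """Extend the fitted straight line and convert back from logs."""
    return math.exp(intercept + slope * day)

print(round(extrapolate(10)))  # projection only if growth stays unchanged
```

The projection assumes the growth rate stays exactly as it was over the fitted days, which is the reason for the caution: any change in behaviour or policy bends the line and invalidates the figure.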

Re: Viral statistics

Reply #4
For those interested in information and data concerning Covid-19, I can strongly recommend Full Fact. They explain what the facts mean as they are issued and, where appropriate, expose the flaws in conclusions drawn from them. These flaws are, I'm afraid, all too common at the moment. Full Fact is a registered charity formed, I understand, by a group of statisticians to promote better understanding of figures from official and other sources. There are no charges, spam messages or adverts to worry about. Just enter 'full fact' in Google and a further click will take you to their website.

Re: Viral statistics

Reply #5
I'm glad that my contribution has sparked off some response. To answer the comment left by our esteemed administrator, the so-called normal function that he refers to is probably the standard error curve, usually of Gaussian form, which gives a bell-shaped distribution curve. There are three parameters, which govern the peak height, the standard deviation and, of course, the position of the peak. However, I don't think that the present curve of number of cases or deaths against time will be of this form. The initial rise may look similar to the Gaussian curve, but past the peak there will surely be a long asymmetric "tail" until the frequency falls effectively to zero when the pandemic has run its course.

I'm glad ppa agrees that the log plot gives a much quicker and clearer overview of the rate of infection. It's interesting that Saturday's Daily Telegraph had an article about how Sweden is dealing with the epidemic. The log plots for both Sweden and the UK run parallel, with that for Sweden somewhat below that for the UK. The extrapolation for the UK from the initial rate indicates that the number of deaths would double every 2 days, while that for Sweden was for doubling every 3 days. Thank God, the actual rate is falling away from that in the UK, but we are still doubling the death count every 2.5 days at present.
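
The link between doubling time and the gradient of a log plot can be written out in a few lines. This is a sketch under the assumption of steady exponential growth; the counts are hypothetical and `doubling_time` and `log10_slope` are illustrative names.

```python
import math

def doubling_time(count_start, count_end, days_elapsed):
    """Days for the count to double, assuming steady exponential growth."""
    growth_per_day = math.log(count_end / count_start) / days_elapsed
    return math.log(2) / growth_per_day

def log10_slope(doubling_days):
    """Gradient of the line on a log10 plot for a given doubling time."""
    return math.log10(2) / doubling_days

# Hypothetical counts: 100 deaths rising to 400 over 5 days.
print(round(doubling_time(100, 400, 5), 2))

for days in (2, 2.5, 3):   # doubling times quoted in the post
    print(days, round(log10_slope(days), 3))
```

A shorter doubling time means a steeper line. Two lines that are exactly parallel would share the same doubling time, so a difference between 2 and 3 days should show up as a difference in gradient rather than just in height.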

The main thrust of the article was asking how long Sweden can keep up its policy of avoiding lockdown. Their government feels that Swedes live differently from other European nations and can be trusted to avoid unnecessary personal contact, but it is under increasing pressure, mainly from statisticians, to adopt more stringent measures. We shall see.

Re: Viral statistics

Reply #6
I have received some interesting information from Paul England of St Albans U3A to add to ppa's post. He says:

" Let me recommend you post the link:  https://jamanetwork.com/journals/jama/  for people to access.  It is to the Journal of the American Medical Association (JAMA), and gives links to lots of technical papers, interviews, etc.

On the right hand side of the homepage is a picture hyperlink to the Coronavirus Resource Center, from which you can access lots of articles and a number of really interesting interviews (35-45 min long) with medical and other experts.

I am registered with JAMA but I am not sure if this is essential, and anyway it was free to register.

Another Covid-19 resource is the Future Learn course on the virus.  This is a three module course, and can be viewed at https://www.futurelearn.com/courses/covid19-novel-coronavirus .  You definitely need to register with Future Learn to access the course, but again it is free."

 

Re: Viral statistics

Reply #7
Another point of interest. The Science and Technology editor of the Daily Telegraph had a piece in last Saturday's edition that was taken up again by the Times on Monday. It seems that Anglia Ruskin University has been looking into the connection between coronavirus and vitamin D. She published an interesting graph of vitamin D levels in the general population of various countries, plotted against cases of the virus per million inhabitants.

It shows a linear correlation, or perhaps one forced by linear regression, suggesting that the vitamin does seem to have a beneficial effect in suppressing the number of cases. (The graph itself is credited to the University of East Anglia; maybe someone got their wires crossed.) The variance is pretty big, but it is a generally linear correlation. Countries like Sweden and Slovakia, with very high levels of vitamin D in their system, do have relatively few cases of the disease. In the case of Sweden, it is surmised that although the sunlight is less intense, Swedes are an outdoor people and get in the sun as much as possible. On the other hand, in Spain, which has a lot of sun, the inhabitants have the tradition of covering themselves up against it.

Interestingly, the UK, although having less of the vitamin than other countries, falls well below the regression line for the number of cases. There does seem to me to be a contradiction here, because otherwise the course of the disease follows very closely that of Italy, France and Spain. The real outlier, though, is Iceland, which would appear to have nearly 4 times the number of cases that their vitamin level would predict. Very weird.
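
For anyone wanting to see how a point is judged against a regression line, here is a minimal sketch. The figures are entirely made up and the labels A to F are placeholders, not countries; fitting the line without the suspected outlier is a design choice of this example, not something taken from the newspaper graph.

```python
def fit_line(xs, ys):
    """Ordinary least-squares fit of y = intercept + slope * x."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    slope = sxy / sxx
    return mean_y - slope * mean_x, slope

# Made-up points: (vitamin D level, cases per million). Not real data.
trend = {"A": (40, 950), "B": (50, 700), "C": (60, 640),
         "D": (70, 420), "E": (80, 310)}
outlier = ("F", (65, 2000))   # deliberately placed far above the trend

xs = [x for x, _ in trend.values()]
ys = [y for _, y in trend.values()]
intercept, slope = fit_line(xs, ys)

# Compare the outlier's observed value with what the line would predict.
label, (x_f, observed) = outlier
predicted = intercept + slope * x_f
ratio = observed / predicted

print("slope:", round(slope, 1))
print(label, "observed / predicted:", round(ratio, 1))
```

A negative slope means more vitamin D goes with fewer cases in this toy data, and the ratio shows how far the outlier sits above the line. With real figures the scatter matters just as much as the slope, and a correlation of this kind does not by itself show cause.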

An interesting point that does come out of this rather loose correlation is that it may account for the abnormal sensitivity of black and ethnic minorities to the disease. Genetically, they are primed to make enough vitamin D in very strong sunshine. When they come to our cooler latitudes, they can no longer generate enough vitamin D for health. I have not yet seen this point discussed but it is an interesting theory.