Slow file reading




On Tue, 13 Oct 2009, Poul Riis wrote:

< I have a batch file as shown below. The file is under development and thus
< not finished yet.
< My problem is that a test run now has been running for two hours and still
< not finished.
< The data file is huge (once you get the permission you can find the data
< file "02_hit08.par" here:
< http://www.cfa.harvard.edu/HITRAN/HITRAN2008/HITRAN2008/By-Molecule/Uncompressed-files/).
< I attach a small test portion of the file.
< Can I make the data reading faster? Or is there a way to speed up
< calculations (I don't need exact numbers)? Or what else can I do better?
< 
< Yours,
< Poul Riis

Hi Poul,
A few comments:
-sum is a built-in Maxima function, so it's a bad idea to assign a value
to it (you may end up with errors that are hard to track down);

-your code reads a line at a time, which is almost always a bad idea;
it's better to read large chunks of data;

-eval_string/parse_string are last resorts, since they tend to be slow;
moreover, you are using them to parse your input, when there are better
options;

-numericalio provides a facility to read large chunks of formatted
numeric data into a Maxima structure; I would recommend this;

-finally, your data file appears to contain poorly formatted data (some
numbers appear to run into others); I would use perl/awk/sed to format
the data correctly before reading it into Maxima. Since you already know
the column numbers, this will be painless.
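
A rough sketch of the numericalio route, for illustration only. It assumes
the data have already been cleaned into a plain file with two
whitespace-separated columns per line (wavenumber, line intensity); the
file name co2clean.dat is hypothetical. The constants and I(z) are copied
from your script, and the accumulator is called acc (my choice of name)
instead of sum. The loop is the same trapezoid rule as yours, except that
zold is updated on every pass, which the original loop never does.

```maxima
load(numericalio)$

/* Constants and integrand copied from the quoted script */
c: 299792458$
k: 1.3806504E-23$
h: 6.62606895E-34$
Tj: 273.15 + 15$
kI: float(2*%pi*(k*Tj/(h*c))^4*h*c*c)$
I(z) := kI*z^3/(exp(z) - 1)$
lambda0: h*c/(k*Tj)*100$

/* Hypothetical cleaned file: one row per line, two numeric columns
   (wavenumber, line intensity), separated by whitespace. */
m: read_matrix("c:/atmodat/co2clean.dat")$
ndat: length(m)$   /* number of rows read */

/* Trapezoid rule over z = lambda0 * wavenumber */
acc: 0.0$
zold: lambda0*m[1,1]$
Izold: I(zold)$
for i: 2 thru ndat do (
    z: lambda0*m[i,1],
    Iz: I(z),
    acc: acc + (Izold + Iz)*(z - zold)/2,
    zold: z,
    Izold: Iz)$

print("There were ", ndat, " data; integral = ", acc)$
```

The second column is available as m[i,2] if you want to add the
sigma/sigma0 step inside the same loop. Everything is read in one call, so
readline, substring and eval_string are not needed at all.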

I hope this helps,
Leo

< datfil:openr("c:/atmodat/co2test.par")$
< c:299792458$
< k:1.3806504E-23$
< h:6.62606895E-34$
< g:9.82$
< p0:101325$
< Oj:40e6$
< Rj:Oj/%pi/2$
< T0:273.15$
< Tj:T0+15$
< kI:2*%pi*(k*Tj/(h*c))^4*h*c*c$
< I(z):=kI*z^3/(exp(z)-1)$
< U(x):=exp(-x)-x*incomplete_gamma(0,x)$
< NA:6.02214179e23$
< sigmab:5.6704E-8$
< sigma0:Mluft*g/(Na*alfa*p0)*10000$
< Mluft:0.029$
< alfa:385e-6$
< lambda0:h*c/(k*Tj)*100$
< sum:0$
< fleredata:true$
< ndat:1$
< linje:readline(datfil)$
< enoverlambdaold:eval_string(substring(linje,4,16))$
< sigmaold:eval_string(substring(linje,16,26))$
< zold:lambda0*enoverlambdaold$
< Izold:I(zold)$
< xnsigmaold:sigmaold/sigma0$
< undslipold:U(xnsigmaold)$
< print("sigma0 = ",sigma0," cm^2")$
< print("Tj     = ",Tj," K")$
< print("Ij     = ",sigmab*Tj^4," W/m^2")$
< print("alfa   = ",alfa)$
< while fleredata do
< 	block(linje:readline(datfil),
< 	if linje=false then 
< 		block(fleredata:false,print("slut"))
< 	else
< 		block(ndat:ndat+1,enoverlambda:eval_string(substring(linje,4,16)),sigma:eval_string(substring(linje,16,26)),
< 			z:lambda0*enoverlambda,Iz:I(z),
< 			xnsigma:sigma/sigma0,undslip:U(xnsigma),
< 			sum:sum+(Izold+Iz)*(z-zold)/2,
< 			enoverlambdaold:enoverlambda,sigmaold:sigma,Izold:Iz,undslipold:undslip))$
< print("There were ",ndat," data.")$
< 
< 
