Measuring processing time when loading data into a DB with Spring Batch (single vs. multi-threaded)

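The other day I wrote an entry about loading multiple files into a DB with multiple threads in Spring Batch, but I realized I had never measured how long it actually takes, so I ran a quick test.
# A quad-core CPU would probably show a big difference for this kind of thing,
# but my Ubuntu development machine at work has a Pentium D, so the results are a bit so-so.

■ Preparing the test data
Program for generating the test data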

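#!/usr/bin/ruby
# Usage: ./input.rb <start_number> <file_name> <line_count>
start = ARGV[0].to_i
file_name = ARGV[1]
count = ARGV[2].to_i
# Characters used to build the random string
source = ("a".."z").to_a + ("A".."Z").to_a + (0..9).to_a
# The block form closes the file even if an error occurs
File.open(file_name, 'w') do |file|
  count.times do |i|
    key = ""
    5.times { key += source[rand(source.size)].to_s }
    num = start + i
    # One CSV line: sequential number, 5-character random key
    file.puts num.to_s + "," + key
  end
end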

For Multi (five files of 100,000 lines each)
  ./input.rb 100000 hage1.csv 100000
  ./input.rb 200000 hage2.csv 100000
  ./input.rb 300000 hage3.csv 100000
  ./input.rb 400000 hage4.csv 100000
  ./input.rb 500000 hage5.csv 100000
For Single (one file of 500,000 lines)
  ./input.rb 500000 hage.csv 500000

■ Processing
reader: reads the file
processor: appends the string "value" to the second CSV field and sets the result as the third DB column
writer: INSERTs into the DB (commit-interval="10")
・Single processes one 500,000-line CSV file (hage.csv)
・Multi uses 5 threads, each processing one 100,000-line CSV file (hage1.csv to hage5.csv)

■ Results
★ Multi => about 88 seconds

2010-11-24 16:53:09,189 INFO main [org.springframework.context.support.ClassPathXmlApplicationContext] - <Refreshing org.springframework.context.support.ClassPathXmlApplicationContext@3e86d0: startup date [Wed Nov 24 16:53:09 JST 2010]; root of context hierarchy>
  (omitted)
2010-11-24 16:54:37,925 INFO main [org.springframework.batch.core.launch.support.SimpleJobLauncher] - <Job: [FlowJob: [name=hageMultiJob]] completed with the following parameters: [{}] and the following status: [COMPLETED]>

★ Single => about 140 seconds

2010-11-24 17:00:42,631 INFO [org.springframework.context.support.ClassPathXmlApplicationContext] - <Refreshing org.springframework.context.support.ClassPathXmlApplicationContext@3e86d0: startup date [Wed Nov 24 17:00:42 JST 2010]; root of context hierarchy>
  (omitted)
2010-11-24 17:03:01,594 INFO [org.springframework.batch.core.launch.support.SimpleJobLauncher] - <Job: [FlowJob: [name=hageJob]] completed with the following parameters: [{}] and the following status: [COMPLETED]>

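＝＝＝＝＝
The target DB runs on the same machine, so the CPU became the bottleneck and I could not produce a dramatic difference. Still, running in parallel gave roughly 1.6 times the performance of the single-threaded run, so the measurement was not a waste of time.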
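The elapsed times and the roughly 1.6x figure can be reproduced from the first and last log timestamps of each run shown above. The following is only a small sketch: the helper names `parse_log_time` and `elapsed_seconds` are made up for illustration, and the measured span runs from the application context refresh to the job completion line, so it includes startup time.

```ruby
require 'time'

# Parse a log timestamp such as "2010-11-24 16:53:09,189"
# (the comma separates the milliseconds).
def parse_log_time(text)
  Time.strptime(text, '%Y-%m-%d %H:%M:%S,%L')
end

# Seconds between the first and the last log line of one run.
def elapsed_seconds(first, last)
  parse_log_time(last) - parse_log_time(first)
end

# Timestamps copied from the Multi and Single logs above.
multi  = elapsed_seconds('2010-11-24 16:53:09,189', '2010-11-24 16:54:37,925')
single = elapsed_seconds('2010-11-24 17:00:42,631', '2010-11-24 17:03:01,594')

puts format('Multi:   %.1f s', multi)
puts format('Single:  %.1f s', single)
puts format('Speedup: %.2fx', single / multi)
```

Dividing the Single time by the Multi time gives the speedup from running in parallel.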

shinodoggのテキトーなブログ (shinodogg's casual blog)
