disk IO request queue 2005-08-05 - By Magnus Andersen
I would lower the pagecache parameter a little to see if it helps. My settings are 1 2 8; the default in RHEL 3 is 1 15 30. Also, have you set all the memory parameters in the sysctl file? You can look at www.puschitz.com for the right values; the article is called Tuning and Optimizing RH and Oracle 9i. It explains how to configure AIO, and I believe it also shows how to check whether your big pages are working.
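To make the comparison concrete, here is a minimal Python sketch that parses the three pagecache values and formats them as a sysctl line. The field names (min, borrow, max percent) and the `vm.pagecache` key are my assumptions about how RHEL 3 names them, so check them against your own system before relying on this. The sample input is the `2 30 40` reported in the quoted thread.

```python
from typing import NamedTuple


class PageCache(NamedTuple):
    """The three values of /proc/sys/vm/pagecache (field names are assumed)."""
    min_pct: int
    borrow_pct: int
    max_pct: int


def parse_pagecache(text: str) -> PageCache:
    """Parse the single line printed by `cat /proc/sys/vm/pagecache`."""
    fields = text.split()
    if len(fields) != 3:
        raise ValueError(f"expected 3 values, got {len(fields)}: {text!r}")
    return PageCache(*(int(f) for f in fields))


def sysctl_line(values: PageCache) -> str:
    """Format the values as a line for /etc/sysctl.conf (key name assumed)."""
    return "vm.pagecache = {} {} {}".format(*values)


current = parse_pagecache("2 30 40")  # value reported in the quoted thread
proposed = PageCache(1, 2, 8)         # the settings mentioned above
print("current :", sysctl_line(current))
print("proposed:", sysctl_line(proposed))
```

On a live box you would pass the contents of /proc/sys/vm/pagecache to `parse_pagecache` instead of the hard-coded string.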
On RHEL 3 you can cat /proc/slabinfo and look for the kioctx and kiocb entries. If AIO is working, it should look similar to this:
kioctx   1647  1650 128   55   55 1 : 1008 252
kiocb   72426 72930 128 2415 2431 1 : 1008 252
On a system where Async IO is not working or is disabled it looks more like this:
kioctx 0 0 256 0 0 1 : 252 63
kiocb  0 0 192 0 0 1 : 252 63
If you don't have AIO working, you should re-compile Oracle with that feature enabled. I don't think the 4 KB block size in ext3 is your problem.
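If you want to script the slabinfo check, something like the following Python sketch should do. It assumes the first two numeric columns of each slabinfo line are the active and total object counts, and it treats AIO as "in use" only when both kioctx and kiocb show active objects. The function names are mine, not part of any tool.

```python
def slab_counts(slabinfo_text, names=("kioctx", "kiocb")):
    """Return {cache_name: (active_objs, total_objs)} for the named caches."""
    counts = {}
    for line in slabinfo_text.splitlines():
        fields = line.split()
        # Assumed layout: name, active objects, total objects, object size, ...
        if len(fields) >= 3 and fields[0] in names:
            counts[fields[0]] = (int(fields[1]), int(fields[2]))
    return counts


def aio_in_use(slabinfo_text):
    """True only if both AIO caches are present and have active objects."""
    counts = slab_counts(slabinfo_text)
    return len(counts) == 2 and all(active > 0 for active, _ in counts.values())


# The two samples from this message.
WORKING = (
    "kioctx 1647 1650 128 55 55 1 : 1008 252\n"
    "kiocb 72426 72930 128 2415 2431 1 : 1008 252\n"
)
NOT_WORKING = (
    "kioctx 0 0 256 0 0 1 : 252 63\n"
    "kiocb 0 0 192 0 0 1 : 252 63\n"
)

print(aio_in_use(WORKING))      # True
print(aio_in_use(NOT_WORKING))  # False
```

On the server itself you would feed it `open("/proc/slabinfo").read()` rather than the sample strings.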
Magnus
On 8/5/05, nasvel <nasvel@(protected)> wrote:
> Magnus Andersen wrote:
>
> >A few more questions...
> >
> >1. What does you /etc/fstab look like?
> >
> [mrtg@(protected) mrtg]$ cat /etc/fstab
> LABEL=/ / ext3 defaults 1 1
> LABEL=/boot /boot ext3 defaults 1 2
> none /dev/pts devpts gid=5,mode=620 0 0
> none /proc proc defaults 0 0
> LABEL=/u01 /u01 ext3 defaults 1 2
> LABEL=/u02 /u02 ext3 defaults,noatime 1 2
> LABEL=/u03 /u03 ext3 defaults,noatime 1 2
> LABEL=/u04 /u04 ext3 defaults,noatime 1 2
> LABEL=/u05 /u05 ext3 defaults,noatime 1 2
> LABEL=/u06 /u06 ext3 defaults,noatime 1 2
> LABEL=/u07 /u07 ext3 defaults,noatime 1 2
> LABEL=/u08 /u08 ext3 defaults,noatime 1 2
> LABEL=/u09 /u09 ext3 defaults,noatime 1 2
> LABEL=/u10 /u10 ext3 defaults,noatime 1 2
> LABEL=/u11 /u11 ext3 defaults,noatime 1 2
> LABEL=/var /var ext3 defaults 1 2
> /dev/cciss/c0d0p5 swap swap defaults,pri=1 0 0
> /dev/cciss/c0d0p6 swap swap defaults,pri=2 0 0
> /dev/cciss/c2d1p1 swap swap defaults,pri=3 0 0
> /dev/cciss/c2d1p2 swap swap defaults,pri=4 0 0
> /dev/cdrom /mnt/cdrom iso9660 noauto,owner,kudzu,ro 0 0
> /dev/fd0 /mnt/floppy auto noauto,owner,kudzu 0 0
>
> >2. What does the output from a cat of /proc/sys/vm/pagecache look like?
> >
> [mrtg@(protected) mrtg]$ cat /proc/sys/vm/pagecache
> 2 30 40
>
> >3. What does the output from a cat of /proc/meminfo look like?
> >
> [mrtg@(protected) mrtg]$ cat /proc/meminfo
> total: used: free: shared: buffers: cached:
> Mem: 16555655168 16143187968 412467200 0 216465408 7584522240
> Swap: 4294819840 758366208 3536453632
> MemTotal: 16167632 kB
> MemFree: 402800 kB
> MemShared: 0 kB
> Buffers: 211392 kB
> Cached: 6684672 kB
> SwapCached: 722088 kB
> Active: 224888 kB
> Inact_dirty: 784772 kB
> Inact_clean: 6608492 kB
> Inact_target: 2224276 kB
> HighTotal: 15531996 kB
> HighFree: 380036 kB
> LowTotal: 635636 kB
> LowFree: 22764 kB
> SwapTotal: 4194160 kB
> SwapFree: 3453568 kB
> BigPagesFree: 90112 kB
>
> >4. Is kswapd/kscand processes running alot?
> >
> this is an extrait from sar, frankly, I don't know it sounds too many or
> not for DB server who has 118 processes oracle running constantly.
>
> [mrtg@(protected) mrtg]$ sar -B
> ....
> 14:00:00 pgpgin/s pgpgout/s activepg inadtypg inaclnpg inatarpg
> 14:10:01 12580,92 6087,17 51356 288124 1598086 556069
> 14:20:01 12869,03 7150,97 1358869 211496 357811 556069
> 14:30:00 10836,75 4273,70 39755 305119 1639621 556069
> 14:40:00 11875,15 2157,05 50592 272401 1662126 556069
> 14:50:01 10840,94 3495,95 5300 466818 1511166 556069
> 15:00:00 11218,28 6351,94 811833 260201 878285 556069
> 15:10:00 13609,75 8266,37 1621522 326428 83188 556069
> 15:20:00 12949,44 12996,30 1617815 349851 78578 556069
> 15:30:00 14690,19 11610,43 258207 1542660 240570 556069
> 15:40:00 12655,25 5044,63 21691 426909 1530368 556069
> 15:50:00 12884,14 2814,72 48942 262341 1650613 556069
> 16:00:01 12753,04 2805,44 384267 43088 1497163 556069
> 16:10:00 19176,36 7790,58 1092592 51285 843185 556069
> 16:20:00 3590138,12 9234,77 46522 1835686 156279 556069
> 16:30:01 13198,41 4914,28 13847 335651 1635201 556069
> 16:40:00 13635,15 3062,69 22390 297322 1649422 556069
> 16:50:00 13038,47 4228,73 606612 97051 1269245 556069
> 17:00:00 11530,39 4239,44 1310924 350514 273286 556069
> 17:10:00 15584,85 5776,15 47659 1587050 389788 556069
> 17:20:00 12372,65 6751,91 65247 330337 1585666 556069
> 17:30:00 12188,38 11084,01 1219729 219331 433911 556069
> 17:40:02 14959,71 12696,39 1547112 85828 312153 556069
> 17:50:01 11087,24 15645,09 1585365 321786 124828 556069
> 18:00:00 10330,70 15218,37 1587687 62260 358641 556069
> 18:10:02 9976,87 4021,09 68018 277566 1662482 556069
> 18:20:00 13582,79 4956,19 830875 136754 993023 556069
> 18:30:00 14311,43 6323,61 401770 104135 1449782 556069
> 18:40:00 12212,25 10072,07 1333636 283263 338959 556069
> 18:50:00 11533,66 8419,48 670885 81454 1206864 556069
> 19:00:02 11976,42 5428,17 57846 244240 1661806 556069
> 19:10:02 7181,63 890,05 60445 243424 1662280 556069
> 19:20:00 9276,39 3165,65 673303 716847 549284 556069
> 19:30:00 12674,96 7099,56 21974 1414586 485291 556069
> Moyenne: 41202,55 4322,33 284646 199764 1376526 556069
>
> >I don't think there is a big difference between hugetlb and bigpages.
> >I do know that I didn't have this implemented and I saw similar
> >behavior. Since I implemented hugetlb my server has been running
> >perfect. I also did not have a memory issue, but tuning the pagecache
> >and bdflush vm parameters help my performance alot.
> >
> also, I found another thing which might cause the problem, but I'm not
> very sure. I'm using the ext3 fs which has the blocksize as 4k, and the
> DBA's configured the database blocksize as 8k. Do you think if that
> could be the cause of the bottleneck of IO? (cause one read request
> oracle will invoke two read() system. )
>
> >Also, are you using AIO?
> >
> I don't know. I'll check it out.
>
> >Magnus
> >
> many thanks again!
>
> dux
>
> >On 8/5/05, nasvel <nasvel@(protected)> wrote:
> >
> >>Magnus Andersen wrote:
> >>
> >>>This sounds very similar to what I experienced when I went live on a
> >>>RHEL 3 / 9i environment. A couple of questions.
> >>>
> >>>1. How are the Oracle share mounted to the system?
> >>>
> >>For oracle, we've got 4 harddisk attached to a controller SCSI. We have
> >>a big tablespace which is composed of 16 dbf files. And the 16 files is
> >>spreading out on the first 3 disks, and the last disk we use to store
> >>the index tablespace.
> >>
> >>>2. Have you played with Linux vm?
> >>>
> >>We've tuned the shmmax and max open files. And we're not lack of memory,
> >>there is 6G cached memory.
> >>[mrtg@(protected) mrtg]$ free
> >> total used free shared buffers cached
> >>Mem: 16167632 16105196 62436 0 189340 6684672
> >>-/+ buffers/cache: 9231184 6936448
> >>Swap: 4194160 1196604 2997556
> >>
> >>>3. Are you using hugetlb?
> >>>
> >>no, because hugetlb is not available in AS2.1. But in AS2.1 the bigpages
> >>is enabled. According oracle, there is no big diff between them. You
> >>think it's important?
> >>
> >>http://www.oracle.com/technology/pub/notes/technote_rhel3.html
> >>
> >>Enterprise Linux 3 has replaced bigpages with a feature called hugetlb,
> >>a backport of what is also in Linux kernel 2.6. There are a few
> >>differences in how hugetlb works. Hugetlb behavior is similar to that of
> >>bigpages; the pages are backed by large TLB entries, are not pageable,
> >>and are preallocated, which means that once you allocate x megabytes of
> >>hugetlb pages, that amount of physical memory can be used only through
> >>hugetlbfs or shm allocated with SHM_HUGETLB.
> >>
> >>Thank you very much!
> >>
> >>dux
> >>
> >>>On 8/5/05, nasvel <nasvel@(protected)> wrote:
> >>>
> >>>>Hi,
> >>>>
> >>>>since some weeks our database server (redhat taroon + oracle 9i)
> >>>>suffered from a very bad performance. The load avg climbed sometimes to
> >>>>80% :(, althought I think I've a powerful machine (HP, 3 Intel Xeon with
> >>>>16G memory).
> >>>>
> >>>>To try to find out the problem, I looked at the iostat report. I found
> >>>>the await time are pretty high, and the average queue length is about
> >>>>10. Someone told me that it is normal for a DB server, but I have some
> >>>>doubt, so I would like to have you guy's opinions about that...
> >>>>
> >>>>any suggestion is welcome, thanks
> >>>>
> >>>>dux
> >>>>
> >>>>=== begin output ===
> >>>>
> >>>>Linux 2.4.9-e.62enterprise 05.08.2005
> >>>>
> >>>>cpu-moy: %user %nice %sys %idle
> >>>> 17,63 0,02 12,31 70,04
> >>>>
> >>>>Device: rrqm/s wrqm/s r/s w/s rsec/s wsec/s avgrq-sz avgqu-sz await svctm %util
> >>>>cciss/c1d1p1 889,26 11,04 515,44 13,11 623,61 193,26 1,55 10,62 21,30 8,03 42,42
> >>>>cciss/c1d0p1 273,76 45,13 100,64 48,85 872,20 751,95 10,86 10,62 114,52 26,39 39,45
> >>>>cciss/c1d2p1 198,26 108,73 107,90 27,83 326,31 1061,51 10,22 10,62 100,75 26,34 35,76
> >>>>cciss/c1d3p1 233,38 29,04 89,97 30,70 463,79 477,95 7,80 8,29 68,69 26,78 32,32
> >>>>
> >>>>=== end output ===
> >>>>
> >>>>--
> >>>>Taroon-list mailing list
> >>>>Taroon-list@(protected)
> >>>>http://www.redhat.com/mailman/listinfo/taroon-list
> >>>>
> >>--
> >>Taroon-list mailing list
> >>Taroon-list@(protected)
> >>http://www.redhat.com/mailman/listinfo/taroon-list
> >>
>
> --
> Taroon-list mailing list
> Taroon-list@(protected)
> http://www.redhat.com/mailman/listinfo/taroon-list
>
-- Magnus Andersen Systems Administrator / Oracle DBA Walker & Associates, Inc.
-- Taroon-list mailing list Taroon-list@(protected) http://www.redhat.com/mailman/listinfo/taroon-list