mainboard: X8DTL-i
CPU: Intel(R) Xeon(R) CPU E5606 @ 2.13GHz
BIOS: X8DTL31.C30 – BIOS Revision: R 2.1a
OS: CentOS 6.3 (latest)
Kernel: 2.6.32-279.5.2.el6.x86_64
07:00.0 Ethernet controller: Intel Corporation 82574L Gigabit Network Connection
e1000e driver: latest from elrepo kmod-e1000e
modinfo e1000e
filename: /lib/modules/2.6.32-279.5.2.el6.x86_64/weak-updates/e1000e/e1000e.ko
version: 2.0.0-NAPI
license: GPL
description: Intel(R) PRO/1000 Network Driver
author: Intel Corporation
srcversion: BBDF1C9420EE194E4015419

With this new BIOS, there is an option in the BIOS setup screen to completely disable ASPM.

After this upgrade I also added the line below to grub.conf:

pcie_aspm=off e1000e.IntMode=1,1 e1000e.InterruptThrottleRate=10000,10000 acpi=off

My server has now been online without any problems for the last 48 hours.

Before that, eth0 and eth1 were crashing at random intervals.
The longest uptime before a crash was 23 hours.

It looks like the problem is solved for me with this upgrade…

taken from: http://www.doxer.org/learn-linux/resolved-intel-e1000e-driver-bug-on-82574l-ethernet-controller-causing-network-blipping/

Earlier I posted a question about CentOS 6.2 losing its internet connection intermittently. Now I have finally found the right way to fix it.

Firstly, this is a known bug in the Intel e1000e driver on Linux platforms. It is a driver problem with the Intel 82574L (an MSI/MSI-X interrupt issue). The network connection drops now and then, and nothing is logged about it, which is very bad for troubleshooting.
You can see more bug reporting about this at https://bugzilla.redhat.com/show_bug.cgi?id=632650

Fortunately, we can resolve this by installing the kmod-e1000e package from elrepo.org. To solve it, do the following (ignore the lines with strikeouts):

Install kmod-e1000e offered by Elrepo

Import the public key:
rpm --import http://elrepo.org/RPM-GPG-KEY-elrepo.org

To install ELRepo for RHEL-5, SL-5 or CentOS-5:
rpm -Uvh http://elrepo.org/elrepo-release-5-3.el5.elrepo.noarch.rpm

To install ELRepo for RHEL-6, SL-6 or CentOS-6:
rpm -Uvh http://elrepo.org/elrepo-release-6-4.el6.elrepo.noarch.rpm

Before installing the new driver, let’s see our old one:
[root@doxer sites]# lspci |grep -i ethernet
02:00.0 Ethernet controller: Intel Corporation 82574L Gigabit Network Connection
03:00.0 Ethernet controller: Intel Corporation 82574L Gigabit Network Connection

[root@doxer modprobe.d]# lsmod|grep e100
e1000e 219500 0

[root@doxer modprobe.d]# modinfo e1000e
filename: /lib/modules/2.6.32-220.7.1.el6.x86_64/kernel/drivers/net/e1000e/e1000e.ko
version: 1.4.4-k
license: GPL
description: Intel(R) PRO/1000 Network Driver
author: Intel Corporation
srcversion: 6BD7BCA22E0864D9C8B756A

Now let’s install the new kmod-e1000e offered by elrepo:
[root@doxer yum.repos.d]# yum list|grep -i e1000
kmod-e1000.x86_64 8.0.35-1.el6.elrepo elrepo
kmod-e1000e.x86_64 1.9.5-1.el6.elrepo elrepo

[root@doxer yum.repos.d]# yum -y install kmod-e1000e.x86_64

After installation, reboot your machine, and you'll find the driver updated:
[root@doxer ~]# modinfo e1000e
filename: /lib/modules/2.6.32-220.7.1.el6.x86_64/weak-updates/e1000e/e1000e.ko
version: 1.9.5-NAPI
license: GPL
description: Intel(R) PRO/1000 Network Driver
author: Intel Corporation
srcversion: 16A9E37B9207620F5453F5E

[root@doxer ~]# lsmod|grep e100
e1000e 229197 0

Change kernel parameters

Append the following parameters to the kernel line in grub.conf:

pcie_aspm=off e1000e.IntMode=1,1 e1000e.InterruptThrottleRate=10000,10000 acpi=off
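
For reference, after the change the kernel line in /etc/grub.conf would look something like this. This is only a sketch based on the kernel version from the modinfo output above; the root device and the other boot options are placeholders from a typical CentOS 6 install, so yours will differ:

title CentOS (2.6.32-220.7.1.el6.x86_64)
        root (hd0,0)
        kernel /vmlinuz-2.6.32-220.7.1.el6.x86_64 ro root=/dev/VolGroup/lv_root quiet pcie_aspm=off e1000e.IntMode=1,1 e1000e.InterruptThrottleRate=10000,10000 acpi=off
        initrd /initramfs-2.6.32-220.7.1.el6.x86_64.img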

Change NIC parameters (add these lines to /etc/rc.local)

#disable pause autonegotiate
/sbin/ethtool -A eth0 autoneg off
/sbin/ethtool -s eth0 autoneg off
#change tx ring buffer
/sbin/ethtool -G eth0 tx 4096 # maybe too large (consider 512); to increase the interrupt rate instead, use: ethtool -C eth0 rx-usecs 10 (10000 interrupts per second)
#change rx ring buffer
/sbin/ethtool -G eth0 rx 128
#disable wake-on-LAN
/sbin/ethtool -s eth0 wol d
#turn off offload
/sbin/ethtool -K eth0 tx off rx off sg off tso off gso off gro off
#enable TX pause
/sbin/ethtool -A eth0 tx on
#disable ASPM
/sbin/setpci -s 02:00.0 CAP_EXP+10.b=40
/sbin/setpci -s 00:19.0 CAP_EXP+10.b=40
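
To confirm that ASPM really is off after these setpci writes (or after booting with pcie_aspm=off), you can read the link status back with lspci. The bus address here is the one from this box; adjust it to your own lspci output:

# should report "ASPM Disabled" in the LnkCtl line
/sbin/lspci -vv -s 02:00.0 | grep -i aspm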

PS:

pcie_aspm stands for Active State Power Management. It is part of the PCIe power-saving mechanism.
acpi stands for Advanced Configuration and Power Interface.
apic stands for Advanced Programmable Interrupt Controller; it is related to IRQ handling. The APIC is one of several kinds of PIC, and Intel and some other NICs have this feature.

Now reboot your machine, and you should have steadier networking!

PS2:

The reason why there’s so much strikeouts in this article is that I’ve struggled a lot with this kernel bug. Firstly, I thought it’s caused by kernel bug of e1000e driver, and after some searching, I installed kmod-e1000e driver and modified the kernel parameter. Things turned better for a short time. Later, I found the issue was still there, so I tried compile the latest e1000e driver from intel. But neither this worked.

Later, I tried a script which monitored the networking of the time NIC went down. After the NIC failed for several times, I found that Tx traffic was so high each time NIC went to failure(TX bytes went up like 5Gb at a very short time). Based on this, I realized that there may be some DoS attack on the server. Using ntop & tcpdump, I found that DNS traffic was very large, but actually my host was not providing DNS services at all!

Then I wrote some iptable rules to disallow DNS queries etc, and after that, the host now is becoming steady again! Traffic went down as per normal, and everything is now on the track. I’m so happy and so excited about this as this is the first time I’ve stopped an DoS attack!
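
For reference, here is a minimal sketch of the kind of iptables rules I mean, assuming the host provides no DNS service at all (the exact interface and policy details are illustrative):

# drop all inbound DNS queries; this host is not a DNS server
/sbin/iptables -A INPUT -p udp --dport 53 -j DROP
/sbin/iptables -A INPUT -p tcp --dport 53 -j DROP
# persist the rules across reboots (CentOS)
/sbin/service iptables save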

On our machine, set up with a so-called RAID1 on the motherboard's fake RAID,
we want to build a proper mdadm software RAID.
When we turn off the fake RAID on the motherboard, the stubborn CentOS installer says:
Warning: Disks sda, sdb contains BIOS RAID metadata, but are not part of any recognized BIOS RAID sets. Ignoring disks sda, sdb.
So we boot the system in rescue mode
and wipe the fake RAID metadata off the disks,
like this:
dmraid -r -E /dev/sdX
Ah, lovely.
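
From there, a proper mdadm RAID1 can be built on the cleaned disks. A minimal sketch, assuming one Linux RAID partition per disk (the partition names are illustrative):

# create a two-disk software RAID1 array
mdadm --create /dev/md0 --level=1 --raid-devices=2 /dev/sda1 /dev/sdb1
# watch the initial resync progress
cat /proc/mdstat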

Block outgoing email for a particular domain in Exim


touch /etc/blockeddomains
echo "domain.com" >> /etc/blockeddomains

In WHM > Exim Configuration Editor > Advanced Editor, put the following in the topmost box:

domainlist blocked_domains = lsearch;/etc/blockeddomains

Locate the \"ROUTERS CONFIGURATION\" section, and right below these lines:

==============
democheck:
driver = redirect
require_files = \"+/etc/demouids\"
condition = \"${if eq {${lookup {$originator_uid} lsearch {/etc/demouids} {$value}}}{}{false}{true}}\"
allow_fail
data = :fail: demo accounts are not permitted to relay email
==============

Put the following lines:

reject_domains:
driver = redirect
# fail mail addressed to any domain listed in /etc/blockeddomains
domains = +blocked_domains
allow_fail
data = :fail: Connection rejected: SPAM source $domain is manually blacklisted.
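
You can verify the router from the shell with Exim's address-testing mode; mail to a domain listed in /etc/blockeddomains should come back with the :fail: message (domain.com is the example entry added above):

exim -bt user@domain.com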

That's it.


mysqlcheck -u da_admin -p --auto-repair --optimize --all-databases

or

mysqlcheck -Aor -u root -p
A = all databases
o = optimize
r = repair

It turns out there is an easier way to manage this; I learned it on 30 September 2024:

/usr/bin/mysqlcheck -uda_admin -p`grep "^passwd=" /usr/local/directadmin/conf/mysql.conf | cut -d= -f2` --skip-write-binlog --optimize --all-in-1 --all-databases
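
If you want this to run unattended, a weekly cron entry is a natural fit. A sketch, with an illustrative schedule (pick a quiet hour on your server):

# /etc/cron.d/mysql-optimize - every Sunday at 04:30 (illustrative)
30 4 * * 0 root /usr/bin/mysqlcheck -uda_admin -p`grep "^passwd=" /usr/local/directadmin/conf/mysql.conf | cut -d= -f2` --skip-write-binlog --optimize --all-in-1 --all-databases >/dev/null 2>&1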

This is a small tip I like to use, as a freshly installed Linux system normally uses vi as the default editor instead of vim or nano.

So, how do you change the default crontab editor to vim or nano?

To change to vim or nano just run:

export EDITOR="/usr/bin/vim" ; crontab -e 

That is for vim; or, for nano:

export EDITOR="/usr/bin/nano" ; crontab -e 

OK, you are now ready to work with your favorite text editor, instead of the default.
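
To make the change permanent instead of per-session, append the export to your shell profile, for example:

# make vim the default editor for all future shells
echo 'export EDITOR=/usr/bin/vim' >> ~/.bashrc
source ~/.bashrc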

TR NOTE: the article below explains, briefly and to the point, everything I have struggled with for years.

Maybe one day I will translate it into Turkish. The original link is from Pingzine; the link is at the very bottom.

 

You get a call at 2 am. One of your servers, with over 1000 shared accounts on it, has gone down. You rush to the office (thank God it's close to home) and find your support staff frantically working on the server while trying to field calls and emails from irate customers. After several tense moments, the cause is found: the load is very high, causing services to fail. Your support staff suggests a reboot instead of diagnosing the reason for the high load. You say ok, go ahead, as long as the load comes back to normal and all services run normally. Reboot done, and the team spends the rest of the night replying to customers. Later, you have no clue why the load went up the way it did, because there were no logs.

Downtime is serious. In this age of social networking on Twitter and Facebook, bad news flies really fast. This kind of negative publicity can result in serious loss of reputation and customers within a single day. It's no wonder hosts have to be on top of their business all day, every day.

Downtime is a reality in the Hosting business. Do the math. Here are some commonly advertised service availability figures.

  •   99.9% availability equates to 8 hours and 45 minutes of downtime per year.
  •  99.99% availability equates to around 52 minutes of downtime per year.
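
If you want to verify these figures yourself, the arithmetic is simple, for example with bc:

# 99.9% availability = 0.1% of a 365-day year of downtime, in hours
echo "(1 - 0.999) * 365 * 24" | bc -l        # ~8.76 hours, i.e. 8h 45m
# 99.99% availability, downtime in minutes
echo "(1 - 0.9999) * 365 * 24 * 60" | bc -l  # ~52.6 minutes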

Even the most reliable web host has 52 minutes of downtime in a year. This downtime can be a result of scheduled or unscheduled events or both. In this article, we will look at ways to deal with both types of events.

Dealing with scheduled downtime

Scheduled downtimes are a necessary part of server maintenance. A web host who regularly maintains the servers will reduce the incidence of security vulnerabilities, increase performance, and improve customer experience. A good host will have more scheduled downtimes than unscheduled downtimes.

The most important way of dealing with scheduled downtime is through “Proactive Communication”. In this type of communication, you let customers know about the downtime before they find out on their own. Sounds simple, doesn't it? The sad fact is that many hosts do not follow it well enough. So let's see how this helps.

How does proactive communication help?

Proactive communication is a very useful method for customer retention during downtimes.

  •  Gives you time to let your customers know all the great benefits they can hope to get with the changes in the system.
  •  Reduces customer confusion
  •  Helps customers inform their customers of downtime
  •  Reduces the flood of tickets during the downtime
  •  Customers appreciate that you let them in on your plans.

How to setup proactive communication

Before shooting all your customers an email, spend a few minutes deciding what you will tell them. A nicely formatted and complete email will reduce a lot of confusion and reduce the burden on your support team, especially when they are busy with the maintenance work. Here are some pointers.

What to tell your customers during scheduled downtime. Tell them…

  •  When the maintenance is scheduled (Exact date and time)
  •  How long maintenance will last (down to the minutes)
  •  What exactly will get disrupted (e.g., web, mail, etc.)
  •  Reasons for maintenance
  •  Benefits to the customer once the maintenance is done
  •  How to contact support staff during maintenance (via email, forum etc)
  •  Alternative arrangements they can make, if any.

When to tell them

  •  At least one week prior to the event.
  •  Again, 24 hours before the event

How to tell them

  •  News section on website
  •  Email
  •  Social media (Twitter, Facebook)
  •  Forum or blog

Dealing with Unscheduled downtime

Unscheduled downtimes happen when something unexpected or untoward occurs. The reasons for unscheduled downtimes could include sudden increases in traffic, hacking attempts, old software leading to exploited vulnerabilities, DoS attacks, spam flooding the queues, and even the occasional hardware failure. No wonder it is a nightmarish scenario to deal with at 2 am.

So what can hosts do to prevent a massive downturn in the event of a downtime? Simply follow the 2 Ps.

  •  Prevent downtime
  •  Prepare for downtime

How to Prevent downtime

Wouldn’t you service your car periodically to prevent breakdowns and expensive repairs. The same way, a server is the engine on which your hosting business runs. An important way to prevent downtime is to maintain your server hardware and software periodically. This type of server administration is called Proactive server administration.

In proactive server administration, always start by securing the server with at least the following steps. Note that these should be performed by a trained professional.

  •  Make sure the software is all updated
  •  Configure a firewall and restrict access to critical ports
  •  Decide on minimum services and secure those services. Close unwanted services.
  •  If you have shared accounts, check user security such as weak passwords.
  •  Enable extended logging so that diagnosis during a disaster is easier.
  •  Secure world writable directories.

Monitor the availability of servers and individual services. For example, if your server load frequently goes high, set up notifications that warn you when the load crosses a cutoff, long before it becomes dangerously high. This helps you prevent downtime simply by acting before the load creeps up and brings the server down.
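
As a minimal sketch of such a notification (the threshold and address are illustrative; a real setup would usually use proper monitoring software):

#!/bin/sh
# run from cron every few minutes: warn when the 1-minute load passes a cutoff
THRESHOLD=10
LOAD=$(awk '{print int($1)}' /proc/loadavg)
if [ "$LOAD" -ge "$THRESHOLD" ]; then
    echo "Load is $LOAD on $(hostname)" | mail -s "High load warning" admin@example.com
fi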

It is always useful to log all information for critical services, and to set up notifications for certain events. This helps in debugging and in preventing future downtimes. The scenario I presented at the beginning could have been prevented if logs had been maintained.

Keep track of exploits and service vulnerabilities. Sites like secunia.org and milw0rm.com have newsletters and mailing lists you can sign up for, which give you first-hand information on any vulnerabilities. Take action before hackers do.

Also, always conduct a monthly server audit to check for suspicious logins, spamming, server performance, etc.

How to prepare for downtime

The first step to prepare for downtime is to visualize your reaction if an unscheduled downtime took place.

How would you contact your customers? Is your infrastructure up to speed to deal with an emergency? For example, a helpdesk system, your website, phone lines, and email are critical systems that should be available for engaging with your customers in times of downtime.

Some people wonder whether to communicate unscheduled downtime to customers at all. Suppose the downtime is only going to last a few minutes; should the host still inform customers?

And the answer is Yes!! The worst thing a host can do is to have customers find out by themselves, or worse, have their customers find out. By being responsible and letting customers know, you show that you are on top of your business. Customers appreciate the fact that you informed them, rather than the other way around.

Always be prepared to send a lightning response to customers who are experiencing downtime. Here are a few things you should prepare.

  1.   Speed of response. You need to put up information on your website within minutes of the downtime, at the very least.
  2.  Decide where you are going to put this information on the website, and how you are going to contact your customers.
  3.  Many times you need professional help in solving downtime issues. Form those relationships early on, so that they are available when you need them.
  4.  If you have an in-house team, make sure they are ready and knowledgeable to solve these issues when they happen.

Through prevention and careful preparation, you can keep downtime from taking a toll on your business and your customers' businesses.

 

TAKEN FROM: http://www.pingzine.com/server-support-dealing-with-downtime-2/

Original link: http://lathama.net/Migrating_large_sparse_files_over_the_network

Migrating large sparse files over the network

Intro

When you need to move large sparse files across the network, there are many issues related to support for this filesystem feature. Sparse files are files that report a size of X but only allocate the blocks on the filesystem that are actually used. This is a great use of space and very nice for virtualization. In the past, methods like COW (copy-on-write) were used to consume space only as it was needed, and those solutions worked. Sparse file support was then integrated into the Linux kernel, and it is now the preferred way to handle images.

Problem

We need to move a large 100GB+ file from one server to another over the network. The file is sparse, which means that only a small portion of it may actually be used. We do not want to transfer every byte of data, nor to fully allocate the file on the target system.

Solution

Use tar with its sparse-file support, streaming through stdin and stdout. Tar reads the source file twice (once normally and a second time for the sparse scan) before streaming, and on large files this can take time and processing power. The target file is checked as it is written.

Requirements

Pipe Viewer will show us what is happening in the pipe. Without this you may go insane.

serverA:/# aptitude install pv
serverB:/# aptitude install pv

First you need to understand that tar is going to look at the file TWICE. This will take a long time and make you think nothing is happening. Wait, wait, wait, and then smile. Select a port above 1024 and under 45,000 that is not in use by another service.

Example*

serverA:/# tar -cS IMG.img | pv -b | nc -n -q 15 172.20.2.3 5555
serverB:/# nc -n -l 5555 | pv -b | tar -xS

As an example, here is another method. As with all SSH connections, it will cause 99%+ CPU load for the duration of the connection, even with compression off.

tar -cS IMG.img | pv -b | ssh -o 'Compression no' root@172.20.2.3 "cat > IMG.img.tar"

Then you need to extract the TAR image.

tar -xSf IMG.img.tar
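
After extraction it is worth confirming the file is still sparse: the apparent size should match the original, while the real disk usage stays much smaller.

# apparent size vs. blocks actually allocated on disk
du -h --apparent-size IMG.img
du -h IMG.img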

Summary

There are other methods of completing this action, but this method is the fastest I have found. Using rsync with its sparse option does work, but it transfers every null byte over the network, so it takes more time; it also runs two checksums on both the source and target files. Further testing shows that compression can cause issues if one or both of the servers are under load. This method can also be used over SSH or other authenticated protocols.

* This method has only hung once for me. If it causes you issues, wait for the connection to time out or test with another image.

Here is how you get vzdump on a clean version of CentOS (via the hostnode):

rpm -ivh "ftp://ftp.pbone.net/mirror/ftp.freshrpms.net/pub/freshrpms/pub/dag/redhat/el5/en/x86_64/RPMS.dag/cstream-2.7.4-3.el5.rf.x86_64.rpm"
wget http://dag.wieers.com/rpm/packages/perl-LockFile-Simple/perl-LockFile-Simple-0.206-1.el5.rf.noarch.rpm
rpm -ivh perl-LockFile-Simple-0.206-1.el5.rf.noarch.rpm
/bin/rm perl-LockFile-Simple-0.206-1.el5.rf.noarch.rpm
rpm -ivh "http://chrisschuld.com/centos54/vzdump-1.2-6.noarch.rpm"

Since version 1.2-6 of vzdump, the location of the modules is not “automatic”, and I have found it necessary to export the location of the PVE libraries that vzdump requires with this command:

export PERL5LIB=/usr/share/perl5/

DONE! 🙂

vzdump, vzrestore…
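
As a quick usage sketch (the container ID and dump directory are examples; check vzdump --help for the options in your version):

# back up container 101 using a snapshot, then restore it
vzdump --snapshot --dumpdir /vz/dump 101
vzrestore /vz/dump/vzdump-101.tar 101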

———-

IMPORTANT NOTE:
THE STEPS ABOVE WORK ON CENTOS 5.
WHEN I TRIED THEM ON CENTOS 6.2, EVERYTHING TURNED TO MUSH.
ON CENTOS 6.2 IT HAS TO BE DONE LIKE THIS:

cd /tmp
wget http://pkgs.repoforge.org/cstream/cstream-2.7.4-3.el6.rf.i686.rpm
wget http://pkgs.repoforge.org/perl-LockFile-Simple/perl-LockFile-Simple-0.207-1.el6.rf.noarch.rpm
rpm -ivh cstream-2.7.4-3.el6.rf.i686.rpm
rpm -ivh perl-LockFile-Simple-0.207-1.el6.rf.noarch.rpm

rpm -ivh http://download.openvz.org/contrib/utils/vzdump/vzdump-1.2-4.noarch.rpm