E-mail Delivery Delays on svensbluff
Hi!
We’ve become aware of some issues with MailScanner, which are causing the virus scanning system to “clog up” all incoming and outgoing e-mail. The result is just like a clogged pipe: because messages are being processed too slowly, mail is stacking up in the queue waiting for processing.
The part of MailScanner which is having “issues” is the attachment scanning system. This is the part that scans all attachments (images and files) to see if they have viruses or worms embedded in them. We’ve confirmed that if we disable the attachment scanning system, mail shoots right on through at a normal pace.
Right now, we are trying to find a solution to this which does not require us to disable the virus/worm scanning system. While viruses and worms only account for 1% of the e-mail we see on a daily basis, we all know how destructive worms and viruses can be. Even though we make no guarantees or promises on the effectiveness of the virus scanning system, it is something that we feel is important to offer, and we do not want to disable this extra layer of protection unless there is absolutely no other way to restore normal service.
If you are looking for particular e-mails, please hang tight. :D The e-mails are in the queue and will be delivered as the queue flushes out over the next several hours. No e-mail has been deleted or lost; it’s just stacked up awaiting its turn to be processed.
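For the curious, the “clogged pipe” can be put into rough numbers. The sketch below is only an illustration: the function names and every rate in it are made-up assumptions, not measurements from our mail server. It shows why a queue grows while mail arrives faster than it can be scanned, and why it takes a while to flush out even after processing speeds up again.

```python
def backlog_after(arrival_per_min, processed_per_min, minutes, start=0):
    """Messages waiting in the queue after `minutes` (never below zero)."""
    return max(0, start + (arrival_per_min - processed_per_min) * minutes)


def minutes_to_drain(backlog, arrival_per_min, processed_per_min):
    """Minutes until the queue is empty, or None if it can never catch up."""
    surplus = processed_per_min - arrival_per_min
    if surplus <= 0:
        return None  # mail arrives at least as fast as it is processed
    return backlog / surplus


# Hypothetical figures: 60 messages/min arriving, scanning slowed to 20/min.
stuck = backlog_after(arrival_per_min=60, processed_per_min=20, minutes=120)

# Once scanning recovers to a hypothetical 100/min, the queue drains
# at the difference between the two rates.
wait = minutes_to_drain(stuck, arrival_per_min=60, processed_per_min=100)
print(stuck, wait)
```

With these invented rates, two hours of slow scanning leaves 4,800 messages waiting, and the queue needs about two more hours to empty once scanning is back to speed.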
In the meantime we will continue working urgently on this issue to get it resolved ASAP.
Thank you for your ongoing support and patience. It’s truly a pleasure to serve you!
Warmest Regards,
DCSN Team
Server ‘svensbluff’ Drive Replacement
UPDATE, 6:32 PM: That was quick! The replacement was completed before we even had our original post up. All services are back up and running normally. ##
6:29 PM: Our shared hosting server ‘svensbluff’ has been taken temporarily offline to have its backup drive replaced. This is not an OS-critical drive, nor is it a drive serving live sites, so extended downtime and/or data loss is not a concern. However, the drive was failing, and we do need it to run scheduled back-ups, so it is being replaced before it fails completely and creates a snafu.
This is a scheduled replacement and will take 5-10 minutes total, after which the server will come back up.
Client data and live websites will be completely unaffected by this drive replacement. Any mail queued on the ‘net will remain queued until the server comes back up, at which time it will deliver according to the schedule of the sending mail server. There should be very little interruption as the outage is so short.
Thank you very much for your continued business and support. If we may answer any questions for you, please feel free to open a ticket with us. The link to the support desk (which is UP!!) is below or at the right. :D
DCSN Support Team
New Support Desk Online!!
Our new support desk is online and running! Please bookmark the new URL:
http://www.dcn911.com/support-center/
Or you can use the links at the right to get there.
If you have any questions or issues please do let us know right away. :D
Support System Issues
As many of our dear clients have pointed out :D our helpdesk at https://www.dcnsafe.com/tt/ is broken.
Rather than delay fixing our clients’ sites from the drive failure last week, we dropped our own site to last priority and put repairing client sites at first priority. We’ve been providing support via e-mail and that seems to have been working fairly well. Certainly better than that broken support desk!!!
Now that we have gone back to trying to fix the support desk, we have discovered that the database is missing over half our KB articles, and heaven knows how many tickets are missing. We have tried just to resolve the script errors, but it appears the script cannot be restored through a regular account restore process; SupportTrio seems to have some special system that prevents it from working on anything but an “original” installation. So we are left with no choice but to dump the script — we need one that works, not something that breaks whenever a stiff wind blows.
We are installing a brand-new desk (Cerberus) at https://www.dcn911.com/ so it will be totally off-server. This way if one of our shared servers goes down, the support site stays up. And, Cerberus is said to be a very stable solution. Although it does not offer all the capabilities that we want (specifically, it will not suggest KB solutions prior to submitting a ticket) it is better than the alternatives out there.
We hope to have the new support desk installed later this weekend and we will post prominent links to it off this site. We will also e-mail everyone once it is online so you will have the direct link for getting 24/7 online help at any time you need it.
Thank you for your patience and understanding as we’ve worked to get everything repaired and get you taken care of. Our new support site is the direct result of customer suggestions; I hope you find it helpful and useful. I am always grateful for customer suggestions on how we can do things better. Our job is to make your life easier. Please let me know when you have an idea how we can improve our service. :D
Warmest regards,
Karin
Agile Hosting/Door County Networking
eagleharbor kernel upgrade & reboot
8:49 P.M. UPDATE: The kernel upgrade was successful. The server has been upgraded and is back online running the new kernel.
Thank you and have a great evening.
DCSN Team
—@@—
8:44 P.M. NOTICE
Greetings!
We are upgrading the kernel on eagleharbor and will be rebooting the server shortly to make the new kernel take effect. This is a critical security upgrade as our current kernel is exploitable, which means your data is at risk.
In the event the kernel upgrade doesn’t go perfectly, the server may be offline for 20-30 minutes while data center techs look into it. Please bear with us. This is very important security work that must be done immediately — sort of like locking a door in a very dangerous neighborhood. We realize it might result in a short-term bump, but the pay-off is huge.
Thank you very much!!! :D
DCSN Team
eagleharbor rebooting
Update / 4:30 AM, December 1st: Great news: nearly all clients’ sites have been restored and are back online. According to our records and cross-checking, about 98% of clients are back up and running.
- We have run into issues with a couple of sites which our technicians are working on.
- We still need to recompile Apache/PHP to include extended modules and libraries. When we do this, we will also install Zend, for those who need Zend.
- SSL certificates and nameserver zonefiles will be installed as soon as possible.
- If your site should be on a dedicated IP address and is not, please let us know by simply posting a comment on this blog post — our technicians will check into it right away for you.
Equally important, in my opinion, is that the mess we went through yesterday (which I will be happy to talk more about, it’s no secret) has thankfully attracted the attention of data center management. Interestingly, it took one supervisor about half an hour on Thursday evening to fix the problems that the “Level 2 Tech” working on our server had left unresolved over the course of 10 hours… including locating and racking a missing drive which we had been told repeatedly was in the server!
My friends, if you or I pulled this level of incompetence and complete disregard for our customers like we were subjected to yesterday by the data center, we’d be burned to the ground within the day. Needless to say neither you nor I run our businesses this way, which is why I am so upset about it. I know how much better it could be.
So at least I have gotten management’s attention. We shall see if it makes any difference. I am not willing to have servers at a facility where they aren’t assured to be well cared-for. If it wasn’t for you, my clients, I frankly would pack my bags and get the heck outta dodge after what we were just put through. But I recognize that we all need stability, so I am willing to go face-to-face, offer constructive feedback and see if we can’t make a positive out of a negative for the good of our current stability as well as the good of their collective customer base. That’s a long sentence saying, if everybody wins, it’s worth it.
The other great news is that we have lined up a new provider for our future server acquisitions. At this time, we anticipate that any new servers will be deployed with them. As that relationship comes to fruition we will tell you all about it. :D The company is run by folks I have known a very, very long time, is a top-quality dedicated server provider, and has a very solid business model.
Hope this update from the gal at the “top” helps to put your mind at ease and answer your questions. I’ll also be e-mailing out an update/newsletter over the weekend with a LOT more information (some related, some not) so please keep an eye out for that as well.
Any questions or issues, please let us know! Please click the Comment link and we will be in touch ASAP.
Karin
Agile Hosting/DCN
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
Update / 2:56 PM: I can’t believe we are still dealing with this. :(
The data center technicians have finally brought the server back online with a fresh drive, but incredibly, they appear to have done so without also loading the failed & backup drives in the server… so we have, essentially, a blank server sitting there with no data to restore to it! — despite no less than four explicit requests that the drives be mounted as slaves so we could recover data. Needless to say, I am just this side of *irate*. Data center management have already been notified.
We will begin site restores as soon as the data center loads our drives back in the chassis, as originally requested.
Thank you for bearing with us. This situation has us re-analyzing our choice of data centers. This is not acceptable to us.
Karin
Agile Hosting/Door County Networking
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
Update / 8:50 AM: The primary drive has failed, requiring us to put a brand-new drive in the server. Once the new drive has been installed, we will copy everyone’s data to the new drive and bring sites up one-at-a-time. We do not have an ETA for this process yet; at this moment, Level 2 Techs at the data center are still working on mounting a fresh drive in the chassis. As soon as the server is live on the new drive we will start work on bringing sites up. Thank you for your patience!!!
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
Good morning,
The server ‘eagleharbor’ went down at about 4:15 A.M. Central. We attempted to reboot it immediately. Upon reboot the operating system went into a forced FSCK (“file system check”), which is like “CHKDSK” in Windows-land. This is a normal Linux activity which is designed to protect data on a drive.
The FSCK completed once, then upon restart it went back into a forced FSCK again. (Forced meaning, the operating system will not boot from the drive without it; this usually means the filesystem was not shut down cleanly or has errors that need repair. Again, this is a protective measure.)
The server will come back up once the operating system has finished the required FSCK. We’ll keep you up-to-date with its progress.
Thank you!!!
DCSN Team
Apache & PHP upgrade in progress
1:09 PM Central: We are upgrading Apache & PHP on eagleharbor as we speak. The recompile should take about 10 minutes, and should result in little to no downtime. Usually just a momentary restart of the web server is all that’s required. This security upgrade will bring PHP to version 4.4.4. If you have any questions, please contact us via the support desk.
Thank you! :)
DCSN Team, serving Agile Hosting & Door County Networking
eagleharbor up, but domains offline
4:23 AM, Update, RESOLVED: named has been rebuilt and all domains are resolving normally again.
For those who are interested, this is a known bug with Red Hat as well as within cPanel’s update system, and remains unfixed with both groups (circa 10 months since its first report). We’ve advised our entire technical staff of the fix so that if this occurs on any other systems they will be able to repair it quickly.
Thank you for your patience!
DCSN Team, serving
Agile Hosting & Door County Networking
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
4:05 AM: A bad cPanel/OS update has caused the nameserver service to go down. The server is still up, which you can see by visiting your domain via:
http://72.36.234.122/~your_username <== insert your username at "your_username" -- be sure to include the tilde (~)
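If you would rather not assemble that address by hand, here is a tiny sketch of how it is put together. The function name and the sample username “jsmith” are hypothetical; only the IP address and the tilde format come from the notice above.

```python
def preview_url(username, server_ip="72.36.234.122"):
    """Build the temporary by-IP address for a hosting account."""
    # Drop any tilde the caller already typed so it is not doubled.
    name = username.strip().lstrip("~")
    if not name:
        raise ValueError("a username is required")
    return f"http://{server_ip}/~{name}"


# "jsmith" is a made-up account name used purely for illustration.
print(preview_url("jsmith"))
```

Whether or not you type the tilde yourself, the result has exactly one tilde in front of the username.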
We are currently running a force-update of the operating system RPMs (the ones which didn't update properly). The nameserver service will come back up just as soon as they have downloaded and installed. The process should take about 15 minutes, barring unforeseen circumstances.
If you have any questions or concerns please feel free to contact us on Skype (username: agilehosting) or send us an e-mail at baileyhost@gmail.com
Thank you!!!!
DCSN Team, serving
Agile Hosting & Door County Networking
eagleharbor.dcnsafe.com rebooting
The eagleharbor server went offline at 9:45 PM. We have put in a reboot request and the data center technicians are working on bringing the server up as we speak. We will post further updates as they become available.
DCSN Support Team
DOOR COUNTY NETWORKING
AGILE HOSTING
UPDATE 10:05 PM: The OS forced an FSCK prior to booting. The FSCK has completed and the server is back up. We are investigating why this crash occurred and will do whatever we can to correct the issue so it does not occur again.
Thank you for your patience! Have a great evening.
DCSN Support Team
DOOR COUNTY NETWORKING
AGILE HOSTING
eagleharbor.dcnsafe.com rebooting
We are rebooting eagleharbor.dcnsafe.com to flush the memory, as part of security work which has just been completed. The server will be back up within a few minutes.
We are very sorry for any inconvenience this may cause.
Update: 10:23 PM Central, the server is already back online and all services are up and running.
Thank you for your patience! :)
DCSN Support Team
