UPDATE: Patch included below
There is a rather annoying bug floating around in FreeBSD 9.0 / 9.1 where network mbufs leak to the point of mbuf exhaustion.
There is a problem report (PR) filed about this from last year, but it looks to be abandoned.
http://www.freebsd.org/cgi/query-pr.cgi?pr=165903&cat=
I am experiencing the same mbuf leak on fresh 9.1-RELEASE machines (amd64). Most of my machines are ESXi 5.1 VMs running the e1000 (em0) NIC. This VM is stock, with just one freebsd-update done and nothing custom.
I have also experienced this condition on an older 9.0-STABLE from Jul 1st 2012. I did not notice it much before that date, but I can't tell for sure. I have a few machines on that build that I still use, so confirmation was easy.
I do not experience the error if I load up vmware tools and use the vmx3f0 adapter; it only happens with em0.
I have set the mbufs to very high numbers via sysctl kern.ipc.nmbclusters=322144 to buy more time between lockups/crashes.
Most often the systems stay functional and just need a reboot, or more mbufs if I add them. Sometimes the servers lock up or crash as I ifconfig down/up the adapter or attempt to add more mbufs via sysctl.
Is anyone else able to reproduce this?
I have attempted to update the PR or post to the list, but the freebsd.org server and my mail server no longer seem to get along. I'll have to troubleshoot that later this week.
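For anyone trying to reproduce this, one way to confirm the leak is to sample netstat -m every so often and watch whether the "bytes allocated to network" figure only ever grows. Below is a minimal sketch in C of pulling the three numbers out of that line. It assumes the line looks like "NNNK/NNNK/NNNK bytes allocated to network (current/cache/total)"; the helper name parse_net_bytes is made up for illustration and is not part of any FreeBSD tool.

```c
#include <assert.h>
#include <stdio.h>

/*
 * Hypothetical helper: parse the "bytes allocated to network" line
 * printed by netstat -m, assumed to look like
 *   "559467K/4059K/563526K bytes allocated to network (current/cache/total)"
 * Returns 0 on success and -1 if the line does not match.
 */
int parse_net_bytes(const char *line, long *current_k, long *cache_k,
    long *total_k)
{
    assert(line != NULL);
    if (sscanf(line, "%ldK/%ldK/%ldK bytes allocated to network",
        current_k, cache_k, total_k) == 3)
        return 0;
    return -1;
}

/*
 * Growth of the "current" figure (in kilobytes) between two samples.
 * On parse failure returns 0 and sets *ok to 0.
 */
long net_bytes_growth_k(const char *before, const char *after, int *ok)
{
    long cur0, cur1, cache, total;

    *ok = parse_net_bytes(before, &cur0, &cache, &total) == 0 &&
        parse_net_bytes(after, &cur1, &cache, &total) == 0;
    return *ok ? cur1 - cur0 : 0;
}
```

Feeding it two samples taken some hours apart gives a rough growth figure; on a box that is not leaking, the "current" number should go up and down with load rather than climb steadily.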
UPDATE Apr 19th 2013:
Gleb Smirnoff was kind enough to quickly forward me a patch that fixed the problem for me. You will need to apply this to /usr/src/sys/netinet/if_ether.c
I've now run for 2 days, and my mbufs have not increased at all. Thanks for the quick response from the FreeBSD-Stable list.
Index: if_ether.c
===================================================================
--- if_ether.c (revision 249327)
+++ if_ether.c (working copy)
@@ -558,13 +558,13 @@ in_arpinput(struct mbuf *m)
     if (ah->ar_pln != sizeof(struct in_addr)) {
         log(LOG_NOTICE, "in_arp: requested protocol length != %zu\n",
             sizeof(struct in_addr));
-        return;
+        goto drop;
     }
     if (allow_multicast == 0 && ETHER_IS_MULTICAST(ar_sha(ah))) {
         log(LOG_NOTICE, "arp: %*D is multicast\n",
             ifp->if_addrlen, (u_char *)ar_sha(ah), ":");
-        return;
+        goto drop;
     }
     op = ntohs(ah->ar_op);
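To illustrate why swapping return for goto drop would stop the leak, here is a small userland sketch of the same pattern. It is not the kernel code: fake_buf, the fixed-size pool, and all the function names are made up for illustration, and it assumes the drop label in in_arpinput() releases the mbuf before returning, which is what the fix implies. A handler that bails out early without freeing its buffer slowly eats the whole pool, while the version that jumps to a common cleanup label does not.

```c
#include <assert.h>
#include <stddef.h>

#define POOL_SIZE 8             /* tiny pool so exhaustion is easy to hit */

/* Hypothetical stand-in for an mbuf; not the kernel's struct mbuf. */
struct fake_buf {
    int in_use;
    int proto_len;
    int is_multicast;
};

static struct fake_buf pool[POOL_SIZE];

/* Hand out a free buffer, or NULL once the pool is exhausted. */
struct fake_buf *buf_alloc(int proto_len, int is_multicast)
{
    for (size_t i = 0; i < POOL_SIZE; i++) {
        if (!pool[i].in_use) {
            pool[i].in_use = 1;
            pool[i].proto_len = proto_len;
            pool[i].is_multicast = is_multicast;
            return &pool[i];
        }
    }
    return NULL;
}

void buf_free(struct fake_buf *b)
{
    assert(b != NULL && b->in_use);
    b->in_use = 0;
}

int bufs_in_use(void)
{
    int n = 0;
    for (size_t i = 0; i < POOL_SIZE; i++)
        n += pool[i].in_use;
    return n;
}

void pool_reset(void)
{
    for (size_t i = 0; i < POOL_SIZE; i++)
        pool[i].in_use = 0;
}

/* Pre-patch shape: the error paths return without releasing the buffer. */
void input_leaky(struct fake_buf *b)
{
    if (b->proto_len != 4)
        return;                 /* buffer is never freed */
    if (b->is_multicast)
        return;                 /* buffer is never freed */
    buf_free(b);                /* normal path consumes the buffer */
}

/* Post-patch shape: every exit goes through the cleanup label. */
void input_fixed(struct fake_buf *b)
{
    if (b->proto_len != 4)
        goto drop;
    if (b->is_multicast)
        goto drop;
    /* normal processing would happen here */
drop:
    buf_free(b);
}
```

With input_leaky, every malformed or multicast packet pins one buffer for good, so a steady trickle of such packets on the wire is enough to run the pool dry over time.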
Christopher - sorry to go off on a tangent...
You posted a problem in the VMware support forums that is *exactly* the same as the one I am having, with DHCP broadcast packets not being passed from one host to another on the same dvPortGroup. I was just wondering if you ever found a solution. I don't believe it is a hardware problem. The URL is http://communities.vmware.com/message/2020958#2020958
Please follow up if solved or e-mail me at maulermark@yahoo.com
Thanks!
Hi Mark,
The problem for me was Dell hardware, and it was resolved.
I've updated my original post on vmware.com to reflect that fact.
Have fun tracking down the source.. it can take a while.. :-)
[Update]
The above two comments are not specifically about the problem I posted here.
Concerning FreeBSD mbuf leak:
I'm installing a new 9.1-STABLE system to see if the problem (with FreeBSD) persists.
I should know in a few days, as the build process just completed.
I've got the same problem on my firewall (pf + pfsync) on 9.0.
Driver: em(4) with chipset 82571EB.
netstat -m | grep allocated:
559467K/4059K/563526K bytes allocated to network (current/cache/total)
Olivier:
Stay tuned, I'm testing a patch that I installed today that may do the trick.