commit b9a9cfdbf7254f4a231cc8ddf685cc29d3a9c6e5 Author: Sasha Levin Date: Fri Mar 4 16:55:54 2016 -0500 Linux 4.1.19 Signed-off-by: Sasha Levin commit 00731a62f0f167760ff7a536a30b9eb441794aaf Author: Neil Horman Date: Thu Feb 18 16:10:57 2016 -0500 sctp: Fix port hash table size computation [ Upstream commit d9749fb5942f51555dc9ce1ac0dbb1806960a975 ] Dmitry Vyukov noted recently that the sctp_port_hashtable had an error in its size computation, observing that the current method never guaranteed that the hashsize (measured in number of entries) would be a power of two, which the input hash function for that table requires. The root cause of the problem is that two values need to be computed (one, the allocation order of the storage requries, as passed to __get_free_pages, and two the number of entries for the hash table). Both need to be ^2, but for different reasons, and the existing code is simply computing one order value, and using it as the basis for both, which is wrong (i.e. it assumes that ((1< Reported-by: Dmitry Vyukov CC: Dmitry Vyukov CC: Vladislav Yasevich CC: "David S. Miller" Signed-off-by: David S. Miller Signed-off-by: Sasha Levin commit fdbc49cb20c7711e07010acd7bcf5e6839cf2a86 Author: Dmitry V. Levin Date: Fri Feb 19 04:27:48 2016 +0300 unix_diag: fix incorrect sign extension in unix_lookup_by_ino [ Upstream commit b5f0549231ffb025337be5a625b0ff9f52b016f0 ] The value passed by unix_diag_get_exact to unix_lookup_by_ino has type __u32, but unix_lookup_by_ino's argument ino has type int, which is not a problem yet. However, when ino is compared with sock_i_ino return value of type unsigned long, ino is sign extended to signed long, and this results to incorrect comparison on 64-bit architectures for inode numbers greater than INT_MAX. This bug was found by strace test suite. Fixes: 5d3cae8bc39d ("unix_diag: Dumping exact socket core") Signed-off-by: Dmitry V. Levin Acked-by: Cong Wang Signed-off-by: David S. Miller Signed-off-by: Sasha Levin commit 3aa450dcb11d582aab3a6aaf19b67380ab4322bc Author: Anton Protopopov Date: Tue Feb 16 21:43:16 2016 -0500 rtnl: RTM_GETNETCONF: fix wrong return value [ Upstream commit a97eb33ff225f34a8124774b3373fd244f0e83ce ] An error response from a RTM_GETNETCONF request can return the positive error value EINVAL in the struct nlmsgerr that can mislead userspace. Signed-off-by: Anton Protopopov Acked-by: Cong Wang Signed-off-by: David S. Miller Signed-off-by: Sasha Levin commit 1610a64d0c495cd84d848e5ff4e92f1f9f0ec6e5 Author: Xin Long Date: Thu Feb 18 21:21:19 2016 +0800 route: check and remove route cache when we get route [ Upstream commit deed49df7390d5239024199e249190328f1651e7 ] Since the gc of ipv4 route was removed, the route cached would has no chance to be removed, and even it has been timeout, it still could be used, cause no code to check it's expires. Fix this issue by checking and removing route cache when we get route. Signed-off-by: Xin Long Acked-by: Hannes Frederic Sowa Signed-off-by: David S. Miller Signed-off-by: Sasha Levin commit dda12d1bb3534b095c3d3e537e081339e0b7192f Author: Guillaume Nault Date: Mon Feb 15 17:01:10 2016 +0100 pppoe: fix reference counting in PPPoE proxy [ Upstream commit 29e73269aa4d36f92b35610c25f8b01c789b0dc8 ] Drop reference on the relay_po socket when __pppoe_xmit() succeeds. This is already handled correctly in the error path. Signed-off-by: Guillaume Nault Signed-off-by: David S. Miller Signed-off-by: Sasha Levin commit 89afe37040ed95a8959d67efc784a27948f0a76f Author: Mark Tomlinson Date: Mon Feb 15 16:24:44 2016 +1300 l2tp: Fix error creating L2TP tunnels [ Upstream commit 853effc55b0f975abd6d318cca486a9c1b67e10f ] A previous commit (33f72e6) added notification via netlink for tunnels when created/modified/deleted. If the notification returned an error, this error was returned from the tunnel function. If there were no listeners, the error code ESRCH was returned, even though having no listeners is not an error. Other calls to this and other similar notification functions either ignore the error code, or filter ESRCH. This patch checks for ESRCH and does not flag this as an error. Reviewed-by: Hamish Martin Signed-off-by: Mark Tomlinson Signed-off-by: David S. Miller Signed-off-by: Sasha Levin commit 29d526aca29faa3236c9787c2512c5ba4c82092a Author: Eugenia Emantayev Date: Wed Feb 17 17:24:27 2016 +0200 net/mlx4_en: Avoid changing dev->features directly in run-time [ Upstream commit 925ab1aa9394bbaeac47ee5b65d3fdf0fb8135cf ] It's forbidden to manually change dev->features in run-time. Currently, this is done in the driver to make sure that GSO_UDP_TUNNEL is advertized only when VXLAN tunnel is set. However, since the stack actually does features intersection with hw_enc_features, we can safely revert to advertizing features early when registering the netdevice. Fixes: f4a1edd56120 ('net/mlx4_en: Advertize encapsulation offloads [...]') Signed-off-by: Eugenia Emantayev Signed-off-by: Or Gerlitz Signed-off-by: David S. Miller Signed-off-by: Sasha Levin commit 5c72a2dd6fedf29368fa67b15c7c076dd93cf3dd Author: Eugenia Emantayev Date: Wed Feb 17 17:24:23 2016 +0200 net/mlx4_en: Choose time-stamping shift value according to HW frequency [ Upstream commit 31c128b66e5b28f468076e4f3ca3025c35342041 ] Previously, the shift value used for time-stamping was constant and didn't depend on the HW chip frequency. Change that to take the frequency into account and calculate the maximal value in cycles per wraparound of ten seconds. This time slot was chosen since it gives a good accuracy in time synchronization. Algorithm for shift value calculation: * Round up the maximal value in cycles to nearest power of two * Calculate maximal multiplier by division of all 64 bits set to above result * Then, invert the function clocksource_khz2mult() to get the shift from maximal mult value Fixes: ec693d47010e ('net/mlx4_en: Add HW timestamping (TS) support') Signed-off-by: Eugenia Emantayev Reviewed-by: Matan Barak Signed-off-by: Or Gerlitz Signed-off-by: David S. Miller Signed-off-by: Sasha Levin commit 7a9a967fa8bd958586896bb89f5d44ebbbbeaec3 Author: Amir Vadai Date: Wed Feb 17 17:24:22 2016 +0200 net/mlx4_en: Count HW buffer overrun only once [ Upstream commit 281e8b2fdf8e4ef366b899453cae50e09b577ada ] RdropOvflw counts overrun of HW buffer, therefore should be used for rx_fifo_errors only. Currently RdropOvflw counter is mistakenly also set into rx_missed_errors and rx_over_errors too, which makes the device total dropped packets accounting to show wrong results. Fix that. Use it for rx_fifo_errors only. Fixes: c27a02cd94d6 ('mlx4_en: Add driver for Mellanox ConnectX 10GbE NIC') Signed-off-by: Amir Vadai Signed-off-by: Eugenia Emantayev Signed-off-by: Or Gerlitz Signed-off-by: David S. Miller Signed-off-by: Sasha Levin commit ce643bc32f94c94d5d018c116ed0138ec19bfb4a Author: Bjørn Mork Date: Fri Feb 12 16:42:14 2016 +0100 qmi_wwan: add "4G LTE usb-modem U901" [ Upstream commit aac8d3c282e024c344c5b86dc1eab7af88bb9716 ] Thomas reports: T: Bus=01 Lev=01 Prnt=01 Port=03 Cnt=01 Dev#= 4 Spd=480 MxCh= 0 D: Ver= 2.00 Cls=00(>ifc ) Sub=00 Prot=00 MxPS=64 #Cfgs= 1 P: Vendor=05c6 ProdID=6001 Rev=00.00 S: Manufacturer=USB Modem S: Product=USB Modem S: SerialNumber=1234567890ABCDEF C: #Ifs= 5 Cfg#= 1 Atr=e0 MxPwr=500mA I: If#= 0 Alt= 0 #EPs= 2 Cls=ff(vend.) Sub=ff Prot=ff Driver=option I: If#= 1 Alt= 0 #EPs= 3 Cls=ff(vend.) Sub=ff Prot=ff Driver=option I: If#= 2 Alt= 0 #EPs= 2 Cls=ff(vend.) Sub=ff Prot=ff Driver=option I: If#= 3 Alt= 0 #EPs= 3 Cls=ff(vend.) Sub=ff Prot=ff Driver=qmi_wwan I: If#= 4 Alt= 0 #EPs= 2 Cls=08(stor.) Sub=06 Prot=50 Driver=usb-storage Reported-by: Thomas Schäfer Signed-off-by: Bjørn Mork Signed-off-by: David S. Miller Signed-off-by: Sasha Levin commit 73fd505d34328b113fb9743688f245991409a830 Author: Rainer Weikusat Date: Thu Feb 11 19:37:27 2016 +0000 af_unix: Guard against other == sk in unix_dgram_sendmsg [ Upstream commit a5527dda344fff0514b7989ef7a755729769daa1 ] The unix_dgram_sendmsg routine use the following test if (unlikely(unix_peer(other) != sk && unix_recvq_full(other))) { to determine if sk and other are in an n:1 association (either established via connect or by using sendto to send messages to an unrelated socket identified by address). This isn't correct as the specified address could have been bound to the sending socket itself or because this socket could have been connected to itself by the time of the unix_peer_get but disconnected before the unix_state_lock(other). In both cases, the if-block would be entered despite other == sk which might either block the sender unintentionally or lead to trying to unlock the same spin lock twice for a non-blocking send. Add a other != sk check to guard against this. Fixes: 7d267278a9ec ("unix: avoid use-after-free in ep_remove_wait_queue") Reported-By: Philipp Hahn Signed-off-by: Rainer Weikusat Tested-by: Philipp Hahn Signed-off-by: David S. Miller Signed-off-by: Sasha Levin commit ffe8615cbb6abdbb66b8c8b33a3e7ea06c6662f9 Author: Eric Dumazet Date: Thu Feb 4 06:23:28 2016 -0800 ipv4: fix memory leaks in ip_cmsg_send() callers [ Upstream commit 919483096bfe75dda338e98d56da91a263746a0a ] Dmitry reported memory leaks of IP options allocated in ip_cmsg_send() when/if this function returns an error. Callers are responsible for the freeing. Many thanks to Dmitry for the report and diagnostic. Reported-by: Dmitry Vyukov Signed-off-by: Eric Dumazet Signed-off-by: David S. Miller Signed-off-by: Sasha Levin commit 54d0901de6ecdfc72514f5c610fe89d65674c126 Author: Jay Vosburgh Date: Tue Feb 2 13:35:56 2016 -0800 bonding: Fix ARP monitor validation [ Upstream commit 21a75f0915dde8674708b39abfcda113911c49b1 ] The current logic in bond_arp_rcv will accept an incoming ARP for validation if (a) the receiving slave is either "active" (which includes the currently active slave, or the current ARP slave) or, (b) there is a currently active slave, and it has received an ARP since it became active. For case (b), the receiving slave isn't the currently active slave, and is receiving the original broadcast ARP request, not an ARP reply from the target. This logic can fail if there is no currently active slave. In this situation, the ARP probe logic cycles through all slaves, assigning each in turn as the "current_arp_slave" for one arp_interval, then setting that one as "active," and sending an ARP probe from that slave. The current logic expects the ARP reply to arrive on the sending current_arp_slave, however, due to switch FDB updating delays, the reply may be directed to another slave. This can arise if the bonding slaves and switch are working, but the ARP target is not responding. When the ARP target recovers, a condition may result wherein the ARP target host replies faster than the switch can update its forwarding table, causing each ARP reply to be sent to the previous current_arp_slave. This will never pass the logic in bond_arp_rcv, as neither of the above conditions (a) or (b) are met. Some experimentation on a LAN shows ARP reply round trips in the 200 usec range, but my available switches never update their FDB in less than 4000 usec. This patch changes the logic in bond_arp_rcv to additionally accept an ARP reply for validation on any slave if there is a current ARP slave and it sent an ARP probe during the previous arp_interval. Fixes: aeea64ac717a ("bonding: don't trust arp requests unless active slave really works") Cc: Veaceslav Falico Cc: Andy Gospodarek Signed-off-by: Jay Vosburgh Signed-off-by: David S. Miller Signed-off-by: Sasha Levin commit 0f912f6700a3f14481c13cbda2b9cc1b636948ac Author: Daniel Borkmann Date: Wed Feb 10 16:47:11 2016 +0100 bpf: fix branch offset adjustment on backjumps after patching ctx expansion [ Upstream commit a1b14d27ed0965838350f1377ff97c93ee383492 ] When ctx access is used, the kernel often needs to expand/rewrite instructions, so after that patching, branch offsets have to be adjusted for both forward and backward jumps in the new eBPF program, but for backward jumps it fails to account the delta. Meaning, for example, if the expansion happens exactly on the insn that sits at the jump target, it doesn't fix up the back jump offset. Analysis on what the check in adjust_branches() is currently doing: /* adjust offset of jmps if necessary */ if (i < pos && i + insn->off + 1 > pos) insn->off += delta; else if (i > pos && i + insn->off + 1 < pos) insn->off -= delta; First condition (forward jumps): Before: After: insns[0] insns[0] insns[1] <--- i/insn insns[1] <--- i/insn insns[2] <--- pos insns[P] <--- pos insns[3] insns[P] `------| delta insns[4] <--- target_X insns[P] `-----| insns[5] insns[3] insns[4] <--- target_X insns[5] First case is if we cross pos-boundary and the jump instruction was before pos. This is handeled correctly. I.e. if i == pos, then this would mean our jump that we currently check was the patchlet itself that we just injected. Since such patchlets are self-contained and have no awareness of any insns before or after the patched one, the delta is correctly not adjusted. Also, for the second condition in case of i + insn->off + 1 == pos, means we jump to that newly patched instruction, so no offset adjustment are needed. That part is correct. Second condition (backward jumps): Before: After: insns[0] insns[0] insns[1] <--- target_X insns[1] <--- target_X insns[2] <--- pos <-- target_Y insns[P] <--- pos <-- target_Y insns[3] insns[P] `------| delta insns[4] <--- i/insn insns[P] `-----| insns[5] insns[3] insns[4] <--- i/insn insns[5] Second interesting case is where we cross pos-boundary and the jump instruction was after pos. Backward jump with i == pos would be impossible and pose a bug somewhere in the patchlet, so the first condition checking i > pos is okay only by itself. However, i + insn->off + 1 < pos does not always work as intended to trigger the adjustment. It works when jump targets would be far off where the delta wouldn't matter. But, for example, where the fixed insn->off before pointed to pos (target_Y), it now points to pos + delta, so that additional room needs to be taken into account for the check. This means that i) both tests here need to be adjusted into pos + delta, and ii) for the second condition, the test needs to be <= as pos itself can be a target in the backjump, too. Fixes: 9bac3d6d548e ("bpf: allow extended BPF programs access skb fields") Signed-off-by: Daniel Borkmann Signed-off-by: David S. Miller Signed-off-by: Sasha Levin commit f68887de614bcf857651c0d2e24cfc3004dc20e4 Author: Alexander Duyck Date: Tue Feb 9 06:14:43 2016 -0800 net: Copy inner L3 and L4 headers as unaligned on GRE TEB [ Upstream commit 78565208d73ca9b654fb9a6b142214d52eeedfd1 ] This patch corrects the unaligned accesses seen on GRE TEB tunnels when generating hash keys. Specifically what this patch does is make it so that we force the use of skb_copy_bits when the GRE inner headers will be unaligned due to NET_IP_ALIGNED being a non-zero value. Signed-off-by: Alexander Duyck Acked-by: Tom Herbert Signed-off-by: David S. Miller Signed-off-by: Sasha Levin commit 916f99656dc7d69355cb1045530c421cb1976590 Author: Alexander Duyck Date: Tue Feb 9 02:49:54 2016 -0800 flow_dissector: Fix unaligned access in __skb_flow_dissector when used by eth_get_headlen [ Upstream commit 461547f3158978c180d74484d58e82be9b8e7357, since we lack the flow dissector flags in this release we guard the flow label access using a test on 'skb' being NULL ] This patch fixes an issue with unaligned accesses when using eth_get_headlen on a page that was DMA aligned instead of being IP aligned. The fact is when trying to check the length we don't need to be looking at the flow label so we can reorder the checks to first check if we are supposed to gather the flow label and then make the call to actually get it. v2: Updated path so that either STOP_AT_FLOW_LABEL or KEY_FLOW_LABEL can cause us to check for the flow label. Reported-by: Sowmini Varadhan Signed-off-by: Alexander Duyck Signed-off-by: David S. Miller Signed-off-by: Sasha Levin commit bb6f76ba4c6b29e4650310982dc61639d34fa833 Author: Xin Long Date: Wed Feb 3 23:33:30 2016 +0800 sctp: translate network order to host order when users get a hmacid [ Upstream commit 7a84bd46647ff181eb2659fdc99590e6f16e501d ] Commit ed5a377d87dc ("sctp: translate host order to network order when setting a hmacid") corrected the hmacid byte-order when setting a hmacid. but the same issue also exists on getting a hmacid. We fix it by changing hmacids to host order when users get them with getsockopt. Fixes: Commit ed5a377d87dc ("sctp: translate host order to network order when setting a hmacid") Signed-off-by: Xin Long Acked-by: Marcelo Ricardo Leitner Signed-off-by: David S. Miller Signed-off-by: Sasha Levin commit beb9881442593e9de581bf7d5396788cfedcdbd3 Author: Siva Reddy Kallam Date: Wed Feb 3 14:09:38 2016 +0530 tg3: Fix for tg3 transmit queue 0 timed out when too many gso_segs [ Upstream commit b7d987295c74500b733a0ba07f9a9bcc4074fa83 ] tg3_tso_bug() can hit a condition where the entire tx ring is not big enough to segment the GSO packet. For example, if MSS is very small, gso_segs can exceed the tx ring size. When we hit the condition, it will cause tx timeout. tg3_tso_bug() is called to handle TSO and DMA hardware bugs. For TSO bugs, if tg3_tso_bug() cannot succeed, we have to drop the packet. For DMA bugs, we can still fall back to linearize the SKB and let the hardware transmit the TSO packet. This patch adds a function tg3_tso_bug_gso_check() to check if there are enough tx descriptors for GSO before calling tg3_tso_bug(). The caller will then handle the error appropriately - drop or lineraize the SKB. v2: Corrected patch description to avoid confusion. Signed-off-by: Siva Reddy Kallam Signed-off-by: Michael Chan Acked-by: Prashant Sreedharan Signed-off-by: David S. Miller Signed-off-by: Sasha Levin commit 3faf465445a1d8677b860eee894aecc3d2e32cc6 Author: Hans Westgaard Ry Date: Wed Feb 3 09:26:57 2016 +0100 net:Add sysctl_max_skb_frags [ Upstream commit 5f74f82ea34c0da80ea0b49192bb5ea06e063593 ] Devices may have limits on the number of fragments in an skb they support. Current codebase uses a constant as maximum for number of fragments one skb can hold and use. When enabling scatter/gather and running traffic with many small messages the codebase uses the maximum number of fragments and may thereby violate the max for certain devices. The patch introduces a global variable as max number of fragments. Signed-off-by: Hans Westgaard Ry Reviewed-by: Håkon Bugge Acked-by: Eric Dumazet Signed-off-by: David S. Miller Signed-off-by: Sasha Levin commit 797c009c98dbb21127a2549d1106ed19d18661cf Author: Hannes Frederic Sowa Date: Wed Feb 3 02:11:03 2016 +0100 unix: correctly track in-flight fds in sending process user_struct [ Upstream commit 415e3d3e90ce9e18727e8843ae343eda5a58fad6 ] The commit referenced in the Fixes tag incorrectly accounted the number of in-flight fds over a unix domain socket to the original opener of the file-descriptor. This allows another process to arbitrary deplete the original file-openers resource limit for the maximum of open files. Instead the sending processes and its struct cred should be credited. To do so, we add a reference counted struct user_struct pointer to the scm_fp_list and use it to account for the number of inflight unix fds. Fixes: 712f4aad406bb1 ("unix: properly account for FDs passed over unix sockets") Reported-by: David Herrmann Cc: David Herrmann Cc: Willy Tarreau Cc: Linus Torvalds Suggested-by: Linus Torvalds Signed-off-by: Hannes Frederic Sowa Signed-off-by: David S. Miller Signed-off-by: Sasha Levin commit 6c92a8f0502d3e9a9036473408f7f67b58f22ea9 Author: Eric Dumazet Date: Tue Feb 2 17:55:01 2016 -0800 ipv6: fix a lockdep splat [ Upstream commit 44c3d0c1c0a880354e9de5d94175742e2c7c9683 ] Silence lockdep false positive about rcu_dereference() being used in the wrong context. First one should use rcu_dereference_protected() as we own the spinlock. Second one should be a normal assignation, as no barrier is needed. Fixes: 18367681a10bd ("ipv6 flowlabel: Convert np->ipv6_fl_list to RCU.") Reported-by: Dave Jones Signed-off-by: Eric Dumazet Acked-by: Hannes Frederic Sowa Signed-off-by: David S. Miller Signed-off-by: Sasha Levin commit 9f88ecf6c707ef3a6e26a8bd58e948728d2093d6 Author: subashab@codeaurora.org Date: Tue Feb 2 02:11:10 2016 +0000 ipv6: addrconf: Fix recursive spin lock call [ Upstream commit 16186a82de1fdd868255448274e64ae2616e2640 ] A rcu stall with the following backtrace was seen on a system with forwarding, optimistic_dad and use_optimistic set. To reproduce, set these flags and allow ipv6 autoconf. This occurs because the device write_lock is acquired while already holding the read_lock. Back trace below - INFO: rcu_preempt self-detected stall on CPU { 1} (t=2100 jiffies g=3992 c=3991 q=4471) <6> Task dump for CPU 1: <2> kworker/1:0 R running task 12168 15 2 0x00000002 <2> Workqueue: ipv6_addrconf addrconf_dad_work <6> Call trace: <2> [] el1_irq+0x68/0xdc <2> [] _raw_write_lock_bh+0x20/0x30 <2> [] __ipv6_dev_ac_inc+0x64/0x1b4 <2> [] addrconf_join_anycast+0x9c/0xc4 <2> [] __ipv6_ifa_notify+0x160/0x29c <2> [] ipv6_ifa_notify+0x50/0x70 <2> [] addrconf_dad_work+0x314/0x334 <2> [] process_one_work+0x244/0x3fc <2> [] worker_thread+0x2f8/0x418 <2> [] kthread+0xe0/0xec v2: do addrconf_dad_kick inside read lock and then acquire write lock for ipv6_ifa_notify as suggested by Eric Fixes: 7fd2561e4ebdd ("net: ipv6: Add a sysctl to make optimistic addresses useful candidates") Cc: Eric Dumazet Cc: Erik Kline Cc: Hannes Frederic Sowa Signed-off-by: Subash Abhinov Kasiviswanathan Acked-by: Hannes Frederic Sowa Acked-by: Eric Dumazet Signed-off-by: David S. Miller Signed-off-by: Sasha Levin commit d07f3017baa86b3e44eff41a41e3b297604ac8fb Author: Hangbin Liu Date: Thu Jul 30 14:28:42 2015 +0800 net/ipv6: add sysctl option accept_ra_min_hop_limit [ Upstream commit 8013d1d7eafb0589ca766db6b74026f76b7f5cb4 ] Commit 6fd99094de2b ("ipv6: Don't reduce hop limit for an interface") disabled accept hop limit from RA if it is smaller than the current hop limit for security stuff. But this behavior kind of break the RFC definition. RFC 4861, 6.3.4. Processing Received Router Advertisements A Router Advertisement field (e.g., Cur Hop Limit, Reachable Time, and Retrans Timer) may contain a value denoting that it is unspecified. In such cases, the parameter should be ignored and the host should continue using whatever value it is already using. If the received Cur Hop Limit value is non-zero, the host SHOULD set its CurHopLimit variable to the received value. So add sysctl option accept_ra_min_hop_limit to let user choose the minimum hop limit value they can accept from RA. And set default to 1 to meet RFC standards. Signed-off-by: Hangbin Liu Acked-by: YOSHIFUJI Hideaki Signed-off-by: David S. Miller Signed-off-by: Sasha Levin commit d04aa950dbd249222f87c96fac9c547890524f5a Author: Paolo Abeni Date: Fri Jan 29 12:30:20 2016 +0100 ipv6/udp: use sticky pktinfo egress ifindex on connect() [ Upstream commit 1cdda91871470f15e79375991bd2eddc6e86ddb1 ] Currently, the egress interface index specified via IPV6_PKTINFO is ignored by __ip6_datagram_connect(), so that RFC 3542 section 6.7 can be subverted when the user space application calls connect() before sendmsg(). Fix it by initializing properly flowi6_oif in connect() before performing the route lookup. Signed-off-by: Paolo Abeni Acked-by: Hannes Frederic Sowa Signed-off-by: David S. Miller Signed-off-by: Sasha Levin commit 0c6943f31317d02178f55b882e36e03f4e5795f7 Author: Paolo Abeni Date: Fri Jan 29 12:30:19 2016 +0100 ipv6: enforce flowi6_oif usage in ip6_dst_lookup_tail() [ Upstream commit 6f21c96a78b835259546d8f3fb4edff0f651d478 ] The current implementation of ip6_dst_lookup_tail basically ignore the egress ifindex match: if the saddr is set, ip6_route_output() purposefully ignores flowi6_oif, due to the commit d46a9d678e4c ("net: ipv6: Dont add RT6_LOOKUP_F_IFACE flag if saddr set"), if the saddr is 'any' the first route lookup in ip6_dst_lookup_tail fails, but upon failure a second lookup will be performed with saddr set, thus ignoring the ifindex constraint. This commit adds an output route lookup function variant, which allows the caller to specify lookup flags, and modify ip6_dst_lookup_tail() to enforce the ifindex match on the second lookup via said helper. ip6_route_output() becames now a static inline function build on top of ip6_route_output_flags(); as a side effect, out-of-tree modules need now a GPL license to access the output route lookup functionality. Signed-off-by: Paolo Abeni Acked-by: Hannes Frederic Sowa Acked-by: David Ahern Signed-off-by: David S. Miller Signed-off-by: Sasha Levin commit 1987c92ad0381beab46a65f79827d650de533584 Author: Eric Dumazet Date: Wed Jan 27 10:52:43 2016 -0800 tcp: beware of alignments in tcp_get_info() [ Upstream commit ff5d749772018602c47509bdc0093ff72acd82ec ] With some combinations of user provided flags in netlink command, it is possible to call tcp_get_info() with a buffer that is not 8-bytes aligned. It does matter on some arches, so we need to use put_unaligned() to store the u64 fields. Current iproute2 package does not trigger this particular issue. Fixes: 0df48c26d841 ("tcp: add tcpi_bytes_acked to tcp_info") Fixes: 977cb0ecf82e ("tcp: add pacing_rate information into tcp_info") Signed-off-by: Eric Dumazet Signed-off-by: David S. Miller Signed-off-by: Sasha Levin commit 4c62ddf7e39943b356157b4c2196a71825c8328c Author: Ido Schimmel Date: Wed Jan 27 15:16:43 2016 +0100 switchdev: Require RTNL mutex to be held when sending FDB notifications [ Upstream commit 4f2c6ae5c64c353fb1b0425e4747e5603feadba1 ] When switchdev drivers process FDB notifications from the underlying device they resolve the netdev to which the entry points to and notify the bridge using the switchdev notifier. However, since the RTNL mutex is not held there is nothing preventing the netdev from disappearing in the middle, which will cause br_switchdev_event() to dereference a non-existing netdev. Make switchdev drivers hold the lock at the beginning of the notification processing session and release it once it ends, after notifying the bridge. Also, remove switchdev_mutex and fdb_lock, as they are no longer needed when RTNL mutex is held. Fixes: 03bf0c281234 ("switchdev: introduce switchdev notifier") Signed-off-by: Ido Schimmel Signed-off-by: Jiri Pirko Signed-off-by: David S. Miller Signed-off-by: Sasha Levin commit 257a04782588b4b97ac005a044f85947f792be44 Author: Parthasarathy Bhuvaragan Date: Wed Jan 27 11:35:59 2016 +0100 tipc: fix connection abort during subscription cancel [ Upstream commit 4d5cfcba2f6ec494d8810b9e3c0a7b06255c8067 ] In 'commit 7fe8097cef5f ("tipc: fix nullpointer bug when subscribing to events")', we terminate the connection if the subscription creation fails. In the same commit, the subscription creation result was based on the value of the subscription pointer (set in the function) instead of the return code. Unfortunately, the same function tipc_subscrp_create() handles subscription cancel request. For a subscription cancellation request, the subscription pointer cannot be set. Thus if a subscriber has several subscriptions and cancels any of them, the connection is terminated. In this commit, we terminate the connection based on the return value of tipc_subscrp_create(). Fixes: commit 7fe8097cef5f ("tipc: fix nullpointer bug when subscribing to events") Reviewed-by: Jon Maloy Signed-off-by: Parthasarathy Bhuvaragan Signed-off-by: David S. Miller Signed-off-by: Sasha Levin commit 6e13e7b0183e0b9945ff350bd5fad573c2ece83c Author: Marcelo Ricardo Leitner Date: Fri Jan 22 18:29:49 2016 -0200 sctp: allow setting SCTP_SACK_IMMEDIATELY by the application [ Upstream commit 27f7ed2b11d42ab6d796e96533c2076ec220affc ] This patch extends commit b93d6471748d ("sctp: implement the sender side for SACK-IMMEDIATELY extension") as it didn't white list SCTP_SACK_IMMEDIATELY on sctp_msghdr_parse(), causing it to be understood as an invalid flag and returning -EINVAL to the application. Note that the actual handling of the flag is already there in sctp_datamsg_from_user(). https://tools.ietf.org/html/rfc7053#section-7 Fixes: b93d6471748d ("sctp: implement the sender side for SACK-IMMEDIATELY extension") Signed-off-by: Marcelo Ricardo Leitner Acked-by: Vlad Yasevich Signed-off-by: David S. Miller Signed-off-by: Sasha Levin commit ce28c3ced53aa6385eafeabaeb9c70eca9e3b1ba Author: Hannes Frederic Sowa Date: Fri Jan 22 01:39:43 2016 +0100 pptp: fix illegal memory access caused by multiple bind()s [ Upstream commit 9a368aff9cb370298fa02feeffa861f2db497c18 ] Several times already this has been reported as kasan reports caused by syzkaller and trinity and people always looked at RCU races, but it is much more simple. :) In case we bind a pptp socket multiple times, we simply add it to the callid_sock list but don't remove the old binding. Thus the old socket stays in the bucket with unused call_id indexes and doesn't get cleaned up. This causes various forms of kasan reports which were hard to pinpoint. Simply don't allow multiple binds and correct error handling in pptp_bind. Also keep sk_state bits in place in pptp_connect. Fixes: 00959ade36acad ("PPTP: PPP over IPv4 (Point-to-Point Tunneling Protocol)") Cc: Dmitry Kozlov Cc: Sasha Levin Cc: Dmitry Vyukov Reported-by: Dmitry Vyukov Cc: Dave Jones Reported-by: Dave Jones Signed-off-by: Hannes Frederic Sowa Signed-off-by: David S. Miller Signed-off-by: Sasha Levin commit 8d988538da0c17711c0de0a53fc38cef49e3ed1b Author: Eric Dumazet Date: Sun Jan 24 13:53:50 2016 -0800 af_unix: fix struct pid memory leak [ Upstream commit fa0dc04df259ba2df3ce1920e9690c7842f8fa4b ] Dmitry reported a struct pid leak detected by a syzkaller program. Bug happens in unix_stream_recvmsg() when we break the loop when a signal is pending, without properly releasing scm. Fixes: b3ca9b02b007 ("net: fix multithreaded signal handling in unix recv routines") Reported-by: Dmitry Vyukov Signed-off-by: Eric Dumazet Cc: Rainer Weikusat Signed-off-by: David S. Miller Signed-off-by: Sasha Levin commit 93493f5aaff542f0d542c48a3fda2dea1afe5e9e Author: Eric Dumazet Date: Thu Jan 21 08:02:54 2016 -0800 tcp: fix NULL deref in tcp_v4_send_ack() [ Upstream commit e62a123b8ef7c5dc4db2c16383d506860ad21b47 ] Neal reported crashes with this stack trace : RIP: 0010:[] tcp_v4_send_ack+0x41/0x20f ... CR2: 0000000000000018 CR3: 000000044005c000 CR4: 00000000001427e0 ... [] tcp_v4_reqsk_send_ack+0xa5/0xb4 [] tcp_check_req+0x2ea/0x3e0 [] tcp_rcv_state_process+0x850/0x2500 [] tcp_v4_do_rcv+0x141/0x330 [] sk_backlog_rcv+0x21/0x30 [] tcp_recvmsg+0x75d/0xf90 [] inet_recvmsg+0x80/0xa0 [] sock_aio_read+0xee/0x110 [] do_sync_read+0x6f/0xa0 [] SyS_read+0x1e1/0x290 [] system_call_fastpath+0x16/0x1b The problem here is the skb we provide to tcp_v4_send_ack() had to be parked in the backlog of a new TCP fastopen child because this child was owned by the user at the time an out of window packet arrived. Before queuing a packet, TCP has to set skb->dev to NULL as the device could disappear before packet is removed from the queue. Fix this issue by using the net pointer provided by the socket (being a timewait or a request socket). IPv6 is immune to the bug : tcp_v6_send_response() already gets the net pointer from the socket if provided. Fixes: 168a8f58059a ("tcp: TCP Fast Open Server - main code path") Reported-by: Neal Cardwell Signed-off-by: Eric Dumazet Cc: Jerry Chu Cc: Yuchung Cheng Acked-by: Neal Cardwell Signed-off-by: David S. Miller Signed-off-by: Sasha Levin commit 0c0c5254583eb9a77e62a4391094258907e08ea1 Author: Manfred Rudigier Date: Wed Jan 20 11:22:28 2016 +0100 net: dp83640: Fix tx timestamp overflow handling. [ Upstream commit 81e8f2e930fe76b9814c71b9d87c30760b5eb705 ] PHY status frames are not reliable, the PHY may not be able to send them during heavy receive traffic. This overflow condition is signaled by the PHY in the next status frame, but the driver did not make use of it. Instead it always reported wrong tx timestamps to user space after an overflow happened because it assigned newly received tx timestamps to old packets in the queue. This commit fixes this issue by clearing the tx timestamp queue every time an overflow happens, so that no timestamps are delivered for overflow packets. This way time stamping will continue correctly after an overflow. Signed-off-by: Manfred Rudigier Acked-by: Richard Cochran Signed-off-by: David S. Miller Signed-off-by: Sasha Levin commit 257ed6d440536351b9ceca7005ee89fbcc352b75 Author: Ursula Braun Date: Tue Jan 19 10:41:33 2016 +0100 af_iucv: Validate socket address length in iucv_sock_bind() [ Upstream commit 52a82e23b9f2a9e1d429c5207f8575784290d008 ] Signed-off-by: Ursula Braun Reported-by: Dmitry Vyukov Reviewed-by: Evgeny Cherkashin Signed-off-by: David S. Miller Signed-off-by: Sasha Levin commit 35cc814663e4f3515fb6bae8a3b5c188aa2b6f89 Author: Gavin Shan Date: Tue Mar 1 16:02:43 2016 +1100 powerpc/eeh: Fix build error caused by pci_dn eeh.h could be included when we have following condition. Then we run into build error as below: (CONFIG_PPC64 && !CONFIG_EEH) || (!CONFIG_PPC64 && !CONFIG_EEH) In file included from arch/powerpc/kernel/of_platform.c:30:0: ./arch/powerpc/include/asm/eeh.h:344:48: error: ‘struct pci_dn’ \ declared inside parameter list [-Werror] : In file included from arch/powerpc/mm/hash_utils_64.c:49:0: ./arch/powerpc/include/asm/eeh.h:344:48: error: ‘struct pci_dn’ \ declared inside parameter list [-Werror] This fixes the issue by replacing those empty inline functions with macro so that we don't rely on @pci_dn when CONFIG_EEH is disabled. Cc: stable@vger.kernel.org # v4.1+ Fixes: ff57b45 ("powerpc/eeh: Do probe on pci_dn") Reported-by: Guenter Roeck Signed-off-by: Gavin Shan Signed-off-by: Sasha Levin commit 62adae8f26b918f4403129e355d81301a629b6a2 Author: Jan Kara Date: Fri Feb 19 00:33:21 2016 -0500 ext4: fix crashes in dioread_nolock mode [ Upstream commit 74dae4278546b897eb81784fdfcce872ddd8b2b8 ] Competing overwrite DIO in dioread_nolock mode will just overwrite pointer to io_end in the inode. This may result in data corruption or extent conversion happening from IO completion interrupt because we don't properly set buffer_defer_completion() when unlocked DIO races with locked DIO to unwritten extent. Since unlocked DIO doesn't need io_end for anything, just avoid allocating it and corrupting pointer from inode for locked DIO. A cleaner fix would be to avoid these games with io_end pointer from the inode but that requires more intrusive changes so we leave that for later. Cc: stable@vger.kernel.org Signed-off-by: Jan Kara Signed-off-by: Theodore Ts'o Signed-off-by: Sasha Levin commit 5d0e8394db90caa6d06477f68cbdb48fe65f468d Author: Kirill A. Shutemov Date: Wed Feb 17 13:11:35 2016 -0800 ipc/shm: handle removed segments gracefully in shm_mmap() [ Upstream commit 15db15e2f10ae12d021c9a2e9edd8a03b9238551 ] commit 1ac0b6dec656f3f78d1c3dd216fad84cb4d0a01e upstream. remap_file_pages(2) emulation can reach file which represents removed IPC ID as long as a memory segment is mapped. It breaks expectations of IPC subsystem. Test case (rewritten to be more human readable, originally autogenerated by syzkaller[1]): #define _GNU_SOURCE #include #include #include #include #define PAGE_SIZE 4096 int main() { int id; void *p; id = shmget(IPC_PRIVATE, 3 * PAGE_SIZE, 0); p = shmat(id, NULL, 0); shmctl(id, IPC_RMID, NULL); remap_file_pages(p, 3 * PAGE_SIZE, 0, 7, 0); return 0; } The patch changes shm_mmap() and code around shm_lock() to propagate locking error back to caller of shm_mmap(). [1] http://github.com/google/syzkaller Signed-off-by: Kirill A. Shutemov Reported-by: Dmitry Vyukov Cc: Davidlohr Bueso Cc: Manfred Spraul Signed-off-by: Andrew Morton Signed-off-by: Linus Torvalds Signed-off-by: Greg Kroah-Hartman Signed-off-by: Sasha Levin commit 6e82212c489fcdc50446bff12bd1ed5e8ef110a2 Author: Davidlohr Bueso Date: Wed Sep 9 15:39:20 2015 -0700 ipc: convert invalid scenarios to use WARN_ON [ Upstream commit d0edd8528362c07216498340e928159510595e7b ] Considering Linus' past rants about the (ab)use of BUG in the kernel, I took a look at how we deal with such calls in ipc. Given that any errors or corruption in ipc code are most likely contained within the set of processes participating in the broken mechanisms, there aren't really many strong fatal system failure scenarios that would require a BUG call. Also, if something is seriously wrong, ipc might not be the place for such a BUG either. 1. For example, recently, a customer hit one of these BUG_ONs in shm after failing shm_lock(). A busted ID imho does not merit a BUG_ON, and WARN would have been better. 2. MSG_COPY functionality of posix msgrcv(2) for checkpoint/restore. I don't see how we can hit this anyway -- at least it should be IS_ERR. The 'copy' arg from do_msgrcv is always set by calling prepare_copy() first and foremost. We could also probably drop this check altogether. Either way, it does not merit a BUG_ON. 3. No ->fault() callback for the fs getting the corresponding page -- seems selfish to make the system unusable. Signed-off-by: Davidlohr Bueso Cc: Manfred Spraul Cc: Linus Torvalds Signed-off-by: Andrew Morton Signed-off-by: Linus Torvalds Signed-off-by: Sasha Levin commit 0eba52da056a131ae7d5715e9bae4e147356d3d9 Author: Davidlohr Bueso Date: Tue Jun 30 14:58:36 2015 -0700 ipc,shm: move BUG_ON check into shm_lock [ Upstream commit c5c8975b2eb4eb7604e8ce4f762987f56d2a96a2 ] Upon every shm_lock call, we BUG_ON if an error was returned, indicating racing either in idr or in shm_destroy. Move this logic into the locking. [akpm@linux-foundation.org: simplify code] Signed-off-by: Davidlohr Bueso Cc: Manfred Spraul Cc: Davidlohr Bueso Signed-off-by: Andrew Morton Signed-off-by: Linus Torvalds Signed-off-by: Sasha Levin commit aff5e5984051fb83578ef2a30db98747a6e39fa6 Author: Kirill A. Shutemov Date: Wed Feb 17 13:11:15 2016 -0800 mm: fix regression in remap_file_pages() emulation [ Upstream commit 48f7df329474b49d83d0dffec1b6186647f11976 ] Grazvydas Ignotas has reported a regression in remap_file_pages() emulation. Testcase: #define _GNU_SOURCE #include #include #include #include #define SIZE (4096 * 3) int main(int argc, char **argv) { unsigned long *p; long i; p = mmap(NULL, SIZE, PROT_READ | PROT_WRITE, MAP_SHARED | MAP_ANONYMOUS, -1, 0); if (p == MAP_FAILED) { perror("mmap"); return -1; } for (i = 0; i < SIZE / 4096; i++) p[i * 4096 / sizeof(*p)] = i; if (remap_file_pages(p, 4096, 0, 1, 0)) { perror("remap_file_pages"); return -1; } if (remap_file_pages(p, 4096 * 2, 0, 1, 0)) { perror("remap_file_pages"); return -1; } assert(p[0] == 1); munmap(p, SIZE); return 0; } The second remap_file_pages() fails with -EINVAL. The reason is that remap_file_pages() emulation assumes that the target vma covers whole area we want to over map. That assumption is broken by first remap_file_pages() call: it split the area into two vma. The solution is to check next adjacent vmas, if they map the same file with the same flags. Fixes: c8d78c1823f4 ("mm: replace remap_file_pages() syscall with emulation") Signed-off-by: Kirill A. Shutemov Reported-by: Grazvydas Ignotas Tested-by: Grazvydas Ignotas Cc: [4.0+] Signed-off-by: Andrew Morton Signed-off-by: Linus Torvalds Signed-off-by: Sasha Levin commit cab48a095169b3f981c600b4365170a076ac9f3a Author: Takashi Iwai Date: Wed Feb 17 14:30:26 2016 +0100 ALSA: pcm: Fix rwsem deadlock for non-atomic PCM stream [ Upstream commit 67ec1072b053c15564e6090ab30127895dc77a89 ] A non-atomic PCM stream may take snd_pcm_link_rwsem rw semaphore twice in the same code path, e.g. one in snd_pcm_action_nonatomic() and another in snd_pcm_stream_lock(). Usually this is OK, but when a write lock is issued between these two read locks, the problem happens: the write lock is blocked due to the first reade lock, and the second read lock is also blocked by the write lock. This eventually deadlocks. The reason is the way rwsem manages waiters; it's queued like FIFO, so even if the writer itself doesn't take the lock yet, it blocks all the waiters (including reads) queued after it. As a workaround, in this patch, we replace the standard down_write() with an spinning loop. This is far from optimal, but it's good enough, as the spinning time is supposed to be relatively short for normal PCM operations, and the code paths requiring the write lock aren't called so often. Reported-by: Vinod Koul Tested-by: Ramesh Babu Cc: # v3.18+ Signed-off-by: Takashi Iwai Signed-off-by: Sasha Levin commit 468eeda0837d3e4f1574f800dc460d85e6b49fb7 Author: Toshi Kani Date: Wed Feb 17 18:16:54 2016 -0700 x86/mm: Fix vmalloc_fault() to handle large pages properly [ Upstream commit f4eafd8bcd5229e998aa252627703b8462c3b90f ] A kernel page fault oops with the callstack below was observed when a read syscall was made to a pmem device after a huge amount (>512GB) of vmalloc ranges was allocated by ioremap() on a x86_64 system: BUG: unable to handle kernel paging request at ffff880840000ff8 IP: vmalloc_fault+0x1be/0x300 PGD c7f03a067 PUD 0 Oops: 0000 [#1] SM Call Trace: __do_page_fault+0x285/0x3e0 do_page_fault+0x2f/0x80 ? put_prev_entity+0x35/0x7a0 page_fault+0x28/0x30 ? memcpy_erms+0x6/0x10 ? schedule+0x35/0x80 ? pmem_rw_bytes+0x6a/0x190 [nd_pmem] ? schedule_timeout+0x183/0x240 btt_log_read+0x63/0x140 [nd_btt] : ? __symbol_put+0x60/0x60 ? kernel_read+0x50/0x80 SyS_finit_module+0xb9/0xf0 entry_SYSCALL_64_fastpath+0x1a/0xa4 Since v4.1, ioremap() supports large page (pud/pmd) mappings in x86_64 and PAE. vmalloc_fault() however assumes that the vmalloc range is limited to pte mappings. vmalloc faults do not normally happen in ioremap'd ranges since ioremap() sets up the kernel page tables, which are shared by user processes. pgd_ctor() sets the kernel's PGD entries to user's during fork(). When allocation of the vmalloc ranges crosses a 512GB boundary, ioremap() allocates a new pud table and updates the kernel PGD entry to point it. If user process's PGD entry does not have this update yet, a read/write syscall to the range will cause a vmalloc fault, which hits the Oops above as it does not handle a large page properly. Following changes are made to vmalloc_fault(). 64-bit: - No change for the PGD sync operation as it handles large pages already. - Add pud_huge() and pmd_huge() to the validation code to handle large pages. - Change pud_page_vaddr() to pud_pfn() since an ioremap range is not directly mapped (while the if-statement still works with a bogus addr). - Change pmd_page() to pmd_pfn() since an ioremap range is not backed by struct page (while the if-statement still works with a bogus addr). 32-bit: - No change for the sync operation since the index3 PGD entry covers the entire vmalloc range, which is always valid. (A separate change to sync PGD entry is necessary if this memory layout is changed regardless of the page size.) - Add pmd_huge() to the validation code to handle large pages. This is for completeness since vmalloc_fault() won't happen in ioremap'd ranges as its PGD entry is always valid. Reported-by: Henning Schild Signed-off-by: Toshi Kani Acked-by: Borislav Petkov Cc: # 4.1+ Cc: Andrew Morton Cc: Andy Lutomirski Cc: Brian Gerst Cc: Denys Vlasenko Cc: H. Peter Anvin Cc: Linus Torvalds Cc: Luis R. Rodriguez Cc: Peter Zijlstra Cc: Thomas Gleixner Cc: Toshi Kani Cc: linux-mm@kvack.org Cc: linux-nvdimm@lists.01.org Link: http://lkml.kernel.org/r/1455758214-24623-1-git-send-email-toshi.kani@hpe.com Signed-off-by: Ingo Molnar Signed-off-by: Sasha Levin commit 1d1338cfb7c94ffa41b3839a31c5ae332315131e Author: Gerd Hoffmann Date: Tue Feb 16 14:25:00 2016 +0100 drm/qxl: use kmalloc_array to alloc reloc_info in qxl_process_single_command [ Upstream commit 34855706c30d52b0a744da44348b5d1cc39fbe51 ] This avoids integer overflows on 32bit machines when calculating reloc_info size, as reported by Alan Cox. Cc: stable@vger.kernel.org Cc: gnomes@lxorguk.ukuu.org.uk Signed-off-by: Gerd Hoffmann Reviewed-by: Daniel Vetter Signed-off-by: Dave Airlie Signed-off-by: Sasha Levin commit 1a82e899582acee9df55a271fb8d6bb73de49fea Author: Rasmus Villemoes Date: Mon Feb 15 19:41:47 2016 +0100 drm/radeon: use post-decrement in error handling [ Upstream commit bc3f5d8c4ca01555820617eb3b6c0857e4df710d ] We need to use post-decrement to get the pci_map_page undone also for i==0, and to avoid some very unpleasant behaviour if pci_map_page failed already at i==0. Reviewed-by: Christian König Signed-off-by: Rasmus Villemoes Signed-off-by: Alex Deucher Cc: stable@vger.kernel.org Signed-off-by: Sasha Levin commit ad4ad1ac3b0aef2779097a298ecc20ef9eef17d6 Author: Takashi Iwai Date: Tue Feb 16 14:15:59 2016 +0100 ALSA: seq: Fix double port list deletion [ Upstream commit 13d5e5d4725c64ec06040d636832e78453f477b7 ] The commit [7f0973e973cd: ALSA: seq: Fix lockdep warnings due to double mutex locks] split the management of two linked lists (source and destination) into two individual calls for avoiding the AB/BA deadlock. However, this may leave the possible double deletion of one of two lists when the counterpart is being deleted concurrently. It ends up with a list corruption, as revealed by syzkaller fuzzer. This patch fixes it by checking the list emptiness and skipping the deletion and the following process. BugLink: http://lkml.kernel.org/r/CACT4Y+bay9qsrz6dQu31EcGaH9XwfW7o3oBzSQUG9fMszoh=Sg@mail.gmail.com Fixes: 7f0973e973cd ('ALSA: seq: Fix lockdep warnings due to 'double mutex locks) Reported-by: Dmitry Vyukov Tested-by: Dmitry Vyukov Cc: Signed-off-by: Takashi Iwai Signed-off-by: Sasha Levin commit a77b1f3d1928a8c31ee0e088e52404331aed5dd5 Author: Arnd Bergmann Date: Fri Feb 12 22:26:42 2016 +0100 tracing: Fix freak link error caused by branch tracer [ Upstream commit b33c8ff4431a343561e2319f17c14286f2aa52e2 ] In my randconfig tests, I came across a bug that involves several components: * gcc-4.9 through at least 5.3 * CONFIG_GCOV_PROFILE_ALL enabling -fprofile-arcs for all files * CONFIG_PROFILE_ALL_BRANCHES overriding every if() * The optimized implementation of do_div() that tries to replace a library call with an division by multiplication * code in drivers/media/dvb-frontends/zl10353.c doing u32 adc_clock = 450560; /* 45.056 MHz */ if (state->config.adc_clock) adc_clock = state->config.adc_clock; do_div(value, adc_clock); In this case, gcc fails to determine whether the divisor in do_div() is __builtin_constant_p(). In particular, it concludes that __builtin_constant_p(adc_clock) is false, while __builtin_constant_p(!!adc_clock) is true. That in turn throws off the logic in do_div() that also uses __builtin_constant_p(), and instead of picking either the constant- optimized division, and the code in ilog2() that uses __builtin_constant_p() to figure out whether it knows the answer at compile time. The result is a link error from failing to find multiple symbols that should never have been called based on the __builtin_constant_p(): dvb-frontends/zl10353.c:138: undefined reference to `____ilog2_NaN' dvb-frontends/zl10353.c:138: undefined reference to `__aeabi_uldivmod' ERROR: "____ilog2_NaN" [drivers/media/dvb-frontends/zl10353.ko] undefined! ERROR: "__aeabi_uldivmod" [drivers/media/dvb-frontends/zl10353.ko] undefined! This patch avoids the problem by changing __trace_if() to check whether the condition is known at compile-time to be nonzero, rather than checking whether it is actually a constant. I see this one link error in roughly one out of 1600 randconfig builds on ARM, and the patch fixes all known instances. Link: http://lkml.kernel.org/r/1455312410-1058841-1-git-send-email-arnd@arndb.de Acked-by: Nicolas Pitre Fixes: ab3c9c686e22 ("branch tracer, intel-iommu: fix build with CONFIG_BRANCH_TRACER=y") Cc: stable@vger.kernel.org # v2.6.30+ Signed-off-by: Arnd Bergmann Signed-off-by: Steven Rostedt Signed-off-by: Sasha Levin commit 999c2ceac0bcd8cb773e627151e4afaefe653427 Author: Steven Rostedt (Red Hat) Date: Mon Feb 15 12:36:14 2016 -0500 tracepoints: Do not trace when cpu is offline [ Upstream commit f37755490fe9bf76f6ba1d8c6591745d3574a6a6 ] The tracepoint infrastructure uses RCU sched protection to enable and disable tracepoints safely. There are some instances where tracepoints are used in infrastructure code (like kfree()) that get called after a CPU is going offline, and perhaps when it is coming back online but hasn't been registered yet. This can probuce the following warning: [ INFO: suspicious RCU usage. ] 4.4.0-00006-g0fe53e8-dirty #34 Tainted: G S ------------------------------- include/trace/events/kmem.h:141 suspicious rcu_dereference_check() usage! other info that might help us debug this: RCU used illegally from offline CPU! rcu_scheduler_active = 1, debug_locks = 1 no locks held by swapper/8/0. stack backtrace: CPU: 8 PID: 0 Comm: swapper/8 Tainted: G S 4.4.0-00006-g0fe53e8-dirty #34 Call Trace: [c0000005b76c78d0] [c0000000008b9540] .dump_stack+0x98/0xd4 (unreliable) [c0000005b76c7950] [c00000000010c898] .lockdep_rcu_suspicious+0x108/0x170 [c0000005b76c79e0] [c00000000029adc0] .kfree+0x390/0x440 [c0000005b76c7a80] [c000000000055f74] .destroy_context+0x44/0x100 [c0000005b76c7b00] [c0000000000934a0] .__mmdrop+0x60/0x150 [c0000005b76c7b90] [c0000000000e3ff0] .idle_task_exit+0x130/0x140 [c0000005b76c7c20] [c000000000075804] .pseries_mach_cpu_die+0x64/0x310 [c0000005b76c7cd0] [c000000000043e7c] .cpu_die+0x3c/0x60 [c0000005b76c7d40] [c0000000000188d8] .arch_cpu_idle_dead+0x28/0x40 [c0000005b76c7db0] [c000000000101e6c] .cpu_startup_entry+0x50c/0x560 [c0000005b76c7ed0] [c000000000043bd8] .start_secondary+0x328/0x360 [c0000005b76c7f90] [c000000000008a6c] start_secondary_prolog+0x10/0x14 This warning is not a false positive either. RCU is not protecting code that is being executed while the CPU is offline. Instead of playing "whack-a-mole(TM)" and adding conditional statements to the tracepoints we find that are used in this instance, simply add a cpu_online() test to the tracepoint code where the tracepoint will be ignored if the CPU is offline. Use of raw_smp_processor_id() is fine, as there should never be a case where the tracepoint code goes from running on a CPU that is online and suddenly gets migrated to a CPU that is offline. Link: http://lkml.kernel.org/r/1455387773-4245-1-git-send-email-kda@linux-powerpc.org Reported-by: Denis Kirjanov Fixes: 97e1c18e8d17b ("tracing: Kernel Tracepoints") Cc: stable@vger.kernel.org # v2.6.28+ Signed-off-by: Steven Rostedt Signed-off-by: Sasha Levin commit 6bdec2b3a4bf4ecc5f9ba5ea24457fe57943fc61 Author: Andy Shevchenko Date: Wed Feb 10 15:59:42 2016 +0200 dmaengine: dw: disable BLOCK IRQs for non-cyclic xfer [ Upstream commit ee1cdcdae59563535485a5f56ee72c894ab7d7ad ] The commit 2895b2cad6e7 ("dmaengine: dw: fix cyclic transfer callbacks") re-enabled BLOCK interrupts with regard to make cyclic transfers work. However, this change becomes a regression for non-cyclic transfers as interrupt counters under stress test had been grown enormously (approximately per 4-5 bytes in the UART loop back test). Taking into consideration above enable BLOCK interrupts if and only if channel is programmed to perform cyclic transfer. Fixes: 2895b2cad6e7 ("dmaengine: dw: fix cyclic transfer callbacks") Signed-off-by: Andy Shevchenko Acked-by: Mans Rullgard Tested-by: Mans Rullgard Acked-by: Viresh Kumar Cc: Signed-off-by: Vinod Koul Signed-off-by: Sasha Levin commit 800b26634a89a88bbbc341c4d1e84f5141c1e38a Author: Takashi Iwai Date: Mon Feb 15 16:37:24 2016 +0100 ALSA: hda - Cancel probe work instead of flush at remove [ Upstream commit 0b8c82190c12e530eb6003720dac103bf63e146e ] The commit [991f86d7ae4e: ALSA: hda - Flush the pending probe work at remove] introduced the sync of async probe work at remove for fixing the race. However, this may lead to another hangup when the module removal is performed quickly before starting the probe work, because it issues flush_work() and it's blocked forever. The workaround is to use cancel_work_sync() instead of flush_work() there. Fixes: 991f86d7ae4e ('ALSA: hda - Flush the pending probe work at remove') Cc: # v3.17+ Signed-off-by: Takashi Iwai Signed-off-by: Sasha Levin commit d26d8b485d21ecc9e466b2a1e02c5962e46624de Author: Takashi Iwai Date: Mon Feb 15 16:20:24 2016 +0100 ALSA: seq: Fix leak of pool buffer at concurrent writes [ Upstream commit d99a36f4728fcbcc501b78447f625bdcce15b842 ] When multiple concurrent writes happen on the ALSA sequencer device right after the open, it may try to allocate vmalloc buffer for each write and leak some of them. It's because the presence check and the assignment of the buffer is done outside the spinlock for the pool. The fix is to move the check and the assignment into the spinlock. (The current implementation is suboptimal, as there can be multiple unnecessary vmallocs because the allocation is done before the check in the spinlock. But the pool size is already checked beforehand, so this isn't a big problem; that is, the only possible path is the multiple writes before any pool assignment, and practically seen, the current coverage should be "good enough".) The issue was triggered by syzkaller fuzzer. BugLink: http://lkml.kernel.org/r/CACT4Y+bSzazpXNvtAr=WXaL8hptqjHwqEyFA+VN2AWEx=aurkg@mail.gmail.com Reported-by: Dmitry Vyukov Tested-by: Dmitry Vyukov Cc: Signed-off-by: Takashi Iwai Signed-off-by: Sasha Levin commit 712394523149ed77377fcc9a264ac35245bd16a3 Author: Gavin Shan Date: Tue Feb 9 15:50:21 2016 +1100 powerpc/eeh: Fix stale cached primary bus [ Upstream commit 05ba75f848647135f063199dc0e9f40fee769724 ] When PE is created, its primary bus is cached to pe->bus. At later point, the cached primary bus is returned from eeh_pe_bus_get(). However, we could get stale cached primary bus and run into kernel crash in one case: full hotplug as part of fenced PHB error recovery releases all PCI busses under the PHB at unplugging time and recreate them at plugging time. pe->bus is still dereferencing the PCI bus that was released. This adds another PE flag (EEH_PE_PRI_BUS) to represent the validity of pe->bus. pe->bus is updated when its first child EEH device is online and the flag is set. Before unplugging in full hotplug for error recovery, the flag is cleared. Fixes: 8cdb2833 ("powerpc/eeh: Trace PCI bus from PE") Cc: stable@vger.kernel.org #v3.11+ Reported-by: Andrew Donnellan Reported-by: Pradipta Ghosh Signed-off-by: Gavin Shan Tested-by: Andrew Donnellan Signed-off-by: Michael Ellerman Signed-off-by: Sasha Levin commit 1ea63b629c9c53af6cdde4daf166b3d31b3e9cfe Author: Andrey Konovalov Date: Sat Feb 13 11:08:06 2016 +0300 ALSA: usb-audio: avoid freeing umidi object twice [ Upstream commit 07d86ca93db7e5cdf4743564d98292042ec21af7 ] The 'umidi' object will be free'd on the error path by snd_usbmidi_free() when tearing down the rawmidi interface. So we shouldn't try to free it in snd_usbmidi_create() after having registered the rawmidi interface. Found by KASAN. Signed-off-by: Andrey Konovalov Acked-by: Clemens Ladisch Cc: Signed-off-by: Takashi Iwai Signed-off-by: Sasha Levin commit f3dd341b9230ac943f317efa4c5b1bfae587434b Author: Hannes Reinecke Date: Fri Feb 12 09:39:15 2016 +0100 bio: return EINTR if copying to user space got interrupted [ Upstream commit 2d99b55d378c996b9692a0c93dd25f4ed5d58934 ] Commit 35dc248383bbab0a7203fca4d722875bc81ef091 introduced a check for current->mm to see if we have a user space context and only copies data if we do. Now if an IO gets interrupted by a signal data isn't copied into user space any more (as we don't have a user space context) but user space isn't notified about it. This patch modifies the behaviour to return -EINTR from bio_uncopy_user() to notify userland that a signal has interrupted the syscall, otherwise it could lead to a situation where the caller may get a buffer with no data returned. This can be reproduced by issuing SG_IO ioctl()s in one thread while constantly sending signals to it. Fixes: 35dc248 [SCSI] sg: Fix user memory corruption when SG_IO is interrupted by a signal Signed-off-by: Johannes Thumshirn Signed-off-by: Hannes Reinecke Cc: stable@vger.kernel.org # v.3.11+ Signed-off-by: Jens Axboe Signed-off-by: Sasha Levin commit d185fa457006e98aa975ed6c0e7d2ddfe3d26695 Author: Ryan Ware Date: Thu Feb 11 15:58:44 2016 -0800 EVM: Use crypto_memneq() for digest comparisons [ Upstream commit 613317bd212c585c20796c10afe5daaa95d4b0a1 ] This patch fixes vulnerability CVE-2016-2085. The problem exists because the vm_verify_hmac() function includes a use of memcmp(). Unfortunately, this allows timing side channel attacks; specifically a MAC forgery complexity drop from 2^128 to 2^12. This patch changes the memcmp() to the cryptographically safe crypto_memneq(). Reported-by: Xiaofei Rex Guo Signed-off-by: Ryan Ware Cc: stable@vger.kernel.org Signed-off-by: Mimi Zohar Signed-off-by: James Morris Signed-off-by: Sasha Levin commit a285eee1ce6b0efc4bcc2599cd3fd96dd6103e6d Author: Eryu Guan Date: Fri Feb 12 01:20:43 2016 -0500 ext4: don't read blocks from disk after extents being swapped [ Upstream commit bcff24887d00bce102e0857d7b0a8c44a40f53d1 ] I notice ext4/307 fails occasionally on ppc64 host, reporting md5 checksum mismatch after moving data from original file to donor file. The reason is that move_extent_per_page() calls __block_write_begin() and block_commit_write() to write saved data from original inode blocks to donor inode blocks, but __block_write_begin() not only maps buffer heads but also reads block content from disk if the size is not block size aligned. At this time the physical block number in mapped buffer head is pointing to the donor file not the original file, and that results in reading wrong data to page, which get written to disk in following block_commit_write call. This also can be reproduced by the following script on 1k block size ext4 on x86_64 host: mnt=/mnt/ext4 donorfile=$mnt/donor testfile=$mnt/testfile e4compact=~/xfstests/src/e4compact rm -f $donorfile $testfile # reserve space for donor file, written by 0xaa and sync to disk to # avoid EBUSY on EXT4_IOC_MOVE_EXT xfs_io -fc "pwrite -S 0xaa 0 1m" -c "fsync" $donorfile # create test file written by 0xbb xfs_io -fc "pwrite -S 0xbb 0 1023" -c "fsync" $testfile # compute initial md5sum md5sum $testfile | tee md5sum.txt # drop cache, force e4compact to read data from disk echo 3 > /proc/sys/vm/drop_caches # test defrag echo "$testfile" | $e4compact -i -v -f $donorfile # check md5sum md5sum -c md5sum.txt Fix it by creating & mapping buffer heads only but not reading blocks from disk, because all the data in page is guaranteed to be up-to-date in mext_page_mkuptodate(). Cc: stable@vger.kernel.org Signed-off-by: Eryu Guan Signed-off-by: Theodore Ts'o Signed-off-by: Sasha Levin commit 9d7679794e9cd701890d997a76024b2239f90638 Author: Insu Yun Date: Fri Feb 12 01:15:59 2016 -0500 ext4: fix potential integer overflow [ Upstream commit 46901760b46064964b41015d00c140c83aa05bcf ] Since sizeof(ext_new_group_data) > sizeof(ext_new_flex_group_data), integer overflow could be happened. Therefore, need to fix integer overflow sanitization. Cc: stable@vger.kernel.org Signed-off-by: Insu Yun Signed-off-by: Theodore Ts'o Signed-off-by: Sasha Levin commit 3a68deb6673cffe2d66f0f0ace615bc0cb4c1153 Author: David Sterba Date: Fri Nov 13 13:44:28 2015 +0100 btrfs: properly set the termination value of ctx->pos in readdir [ Upstream commit bc4ef7592f657ae81b017207a1098817126ad4cb ] The value of ctx->pos in the last readdir call is supposed to be set to INT_MAX due to 32bit compatibility, unless 'pos' is intentially set to a larger value, then it's LLONG_MAX. There's a report from PaX SIZE_OVERFLOW plugin that "ctx->pos++" overflows (https://forums.grsecurity.net/viewtopic.php?f=1&t=4284), on a 64bit arch, where the value is 0x7fffffffffffffff ie. LLONG_MAX before the increment. We can get to that situation like that: * emit all regular readdir entries * still in the same call to readdir, bump the last pos to INT_MAX * next call to readdir will not emit any entries, but will reach the bump code again, finds pos to be INT_MAX and sets it to LLONG_MAX Normally this is not a problem, but if we call readdir again, we'll find 'pos' set to LLONG_MAX and the unconditional increment will overflow. The report from Victor at (http://thread.gmane.org/gmane.comp.file-systems.btrfs/49500) with debugging print shows that pattern: Overflow: e Overflow: 7fffffff Overflow: 7fffffffffffffff PAX: size overflow detected in function btrfs_real_readdir fs/btrfs/inode.c:5760 cicus.935_282 max, count: 9, decl: pos; num: 0; context: dir_context; CPU: 0 PID: 2630 Comm: polkitd Not tainted 4.2.3-grsec #1 Hardware name: Gigabyte Technology Co., Ltd. H81ND2H/H81ND2H, BIOS F3 08/11/2015 ffffffff81901608 0000000000000000 ffffffff819015e6 ffffc90004973d48 ffffffff81742f0f 0000000000000007 ffffffff81901608 ffffc90004973d78 ffffffff811cb706 0000000000000000 ffff8800d47359e0 ffffc90004973ed8 Call Trace: [] dump_stack+0x4c/0x7f [] report_size_overflow+0x36/0x40 [] btrfs_real_readdir+0x69c/0x6d0 [] iterate_dir+0xa8/0x150 [] ? __fget_light+0x2d/0x70 [] SyS_getdents+0xba/0x1c0 Overflow: 1a [] ? iterate_dir+0x150/0x150 [] entry_SYSCALL_64_fastpath+0x12/0x83 The jump from 7fffffff to 7fffffffffffffff happens when new dir entries are not yet synced and are processed from the delayed list. Then the code could go to the bump section again even though it might not emit any new dir entries from the delayed list. The fix avoids entering the "bump" section again once we've finished emitting the entries, both for synced and delayed entries. References: https://forums.grsecurity.net/viewtopic.php?f=1&t=4284 Reported-by: Victor CC: stable@vger.kernel.org Signed-off-by: David Sterba Tested-by: Holger Hoffstätte Signed-off-by: Chris Mason Signed-off-by: Sasha Levin commit 1c35d875d020c798da316ddbac5da7be93b6901c Author: Linus Walleij Date: Wed Feb 10 09:25:17 2016 +0100 ARM: 8519/1: ICST: try other dividends than 1 [ Upstream commit e972c37459c813190461dabfeaac228e00aae259 ] Since the dawn of time the ICST code has only supported divide by one or hang in an eternal loop. Luckily we were always dividing by one because the reference frequency for the systems using the ICSTs is 24MHz and the [min,max] values for the PLL input if [10,320] MHz for ICST307 and [6,200] for ICST525, so the loop will always terminate immediately without assigning any divisor for the reference frequency. But for the code to make sense, let's insert the missing i++ Reported-by: David Binderman Cc: stable@vger.kernel.org Signed-off-by: Linus Walleij Signed-off-by: Russell King Signed-off-by: Sasha Levin commit 5ae3f5ad236f18bda2486bc557333427ec8a6797 Author: Stefan Haberland Date: Tue Dec 15 10:45:05 2015 +0100 s390/dasd: fix refcount for PAV reassignment [ Upstream commit 9d862ababb609439c5d6987f6d3ddd09e703aa0b ] Add refcount to the DASD device when a summary unit check worker is scheduled. This prevents that the device is set offline with worker in place. Cc: stable@vger.kernel.org Signed-off-by: Stefan Haberland Signed-off-by: Martin Schwidefsky Signed-off-by: Sasha Levin commit 2e35e9c4671925a7af5a2d3fe4f5f614e13a0d6d Author: Stefan Haberland Date: Tue Dec 15 10:16:43 2015 +0100 s390/dasd: prevent incorrect length error under z/VM after PAV changes [ Upstream commit 020bf042e5b397479c1174081b935d0ff15d1a64 ] The channel checks the specified length and the provided amount of data for CCWs and provides an incorrect length error if the size does not match. Under z/VM with simulation activated the length may get changed. Having the suppress length indication bit set is stated as good CCW coding practice and avoids errors under z/VM. Cc: stable@vger.kernel.org Signed-off-by: Stefan Haberland Signed-off-by: Martin Schwidefsky Signed-off-by: Sasha Levin commit 4f55047bff0dd28d65e59455fb16d23b8dafcf73 Author: Anton Protopopov Date: Wed Feb 10 12:50:21 2016 -0500 cifs: fix erroneous return value [ Upstream commit 4b550af519854421dfec9f7732cdddeb057134b2 ] The setup_ntlmv2_rsp() function may return positive value ENOMEM instead of -ENOMEM in case of kmalloc failure. Signed-off-by: Anton Protopopov CC: Stable Signed-off-by: Steve French Signed-off-by: Sasha Levin commit 7071cc9b4f9d4fd6dee20bb8dd760a8f9e1ceec7 Author: Nicolai Hähnle Date: Fri Feb 5 14:35:53 2016 -0500 drm/radeon: hold reference to fences in radeon_sa_bo_new [ Upstream commit f6ff4f67cdf8455d0a4226eeeaf5af17c37d05eb ] An arbitrary amount of time can pass between spin_unlock and radeon_fence_wait_any, so we need to ensure that nobody frees the fences from under us. Based on the analogous fix for amdgpu. Signed-off-by: Nicolai Hähnle Reviewed-by: Christian König Cc: stable@vger.kernel.org Signed-off-by: Sasha Levin commit f3d69fd89df665c1fa1931f4af846039a4ac5dc4 Author: Tejun Heo Date: Wed Feb 3 13:54:25 2016 -0500 workqueue: handle NUMA_NO_NODE for unbound pool_workqueue lookup [ Upstream commit d6e022f1d207a161cd88e08ef0371554680ffc46 ] When looking up the pool_workqueue to use for an unbound workqueue, workqueue assumes that the target CPU is always bound to a valid NUMA node. However, currently, when a CPU goes offline, the mapping is destroyed and cpu_to_node() returns NUMA_NO_NODE. This has always been broken but hasn't triggered often enough before 874bbfe600a6 ("workqueue: make sure delayed work run in local cpu"). After the commit, workqueue forcifully assigns the local CPU for delayed work items without explicit target CPU to fix a different issue. This widens the window where CPU can go offline while a delayed work item is pending causing delayed work items dispatched with target CPU set to an already offlined CPU. The resulting NUMA_NO_NODE mapping makes workqueue try to queue the work item on a NULL pool_workqueue and thus crash. While 874bbfe600a6 has been reverted for a different reason making the bug less visible again, it can still happen. Fix it by mapping NUMA_NO_NODE to the default pool_workqueue from unbound_pwq_by_node(). This is a temporary workaround. The long term solution is keeping CPU -> NODE mapping stable across CPU off/online cycles which is being worked on. Signed-off-by: Tejun Heo Reported-by: Mike Galbraith Cc: Tang Chen Cc: Rafael J. Wysocki Cc: Len Brown Cc: stable@vger.kernel.org Link: http://lkml.kernel.org/g/1454424264.11183.46.camel@gmail.com Link: http://lkml.kernel.org/g/1453702100-2597-1-git-send-email-tangchen@cn.fujitsu.com Signed-off-by: Sasha Levin commit d3c4dd8843bef1885fabaa9f61d5d354ec2a5e3a Author: Lai Jiangshan Date: Tue May 12 20:32:29 2015 +0800 workqueue: wq_pool_mutex protects the attrs-installation [ Upstream commit 5b95e1af8d17d85a17728f6de7dbff538e6e3c49 ] Current wq_pool_mutex doesn't proctect the attrs-installation, it results that ->unbound_attrs, ->numa_pwq_tbl[] and ->dfl_pwq can only be accessed under wq->mutex and causes some inconveniences. Example, wq_update_unbound_numa() has to acquire wq->mutex before fetching the wq->unbound_attrs->no_numa and the old_pwq. attrs-installation is a short operation, so this change will no cause any latency for other operations which also acquire the wq_pool_mutex. The only unprotected attrs-installation code is in apply_workqueue_attrs(), so this patch touches code less than comments. It is also a preparation patch for next several patches which read wq->unbound_attrs, wq->numa_pwq_tbl[] and wq->dfl_pwq with only wq_pool_mutex held. Signed-off-by: Lai Jiangshan Signed-off-by: Tejun Heo Signed-off-by: Sasha Levin commit 9e1a3771b412f52694e22a93cf188dbe9f0eab24 Author: Lai Jiangshan Date: Mon Apr 27 17:58:38 2015 +0800 workqueue: split apply_workqueue_attrs() into 3 stages [ Upstream commit 2d5f0764b5264d2954ba6e3deb04f4f5de8e4476 ] Current apply_workqueue_attrs() includes pwqs-allocation and pwqs-installation, so when we batch multiple apply_workqueue_attrs()s as a transaction, we can't ensure the transaction must succeed or fail as a complete unit. To solve this, we split apply_workqueue_attrs() into three stages. The first stage does the preparation: allocation memory, pwqs. The second stage does the attrs-installaion and pwqs-installation. The third stage frees the allocated memory and (old or unused) pwqs. As the result, batching multiple apply_workqueue_attrs()s can succeed or fail as a complete unit: 1) batch do all the first stage for all the workqueues 2) only commit all when all the above succeed. This patch is a preparation for the next patch ("Allow modifying low level unbound workqueue cpumask") which will do a multiple apply_workqueue_attrs(). The patch doesn't have functionality changed except two minor adjustment: 1) free_unbound_pwq() for the error path is removed, we use the heavier version put_pwq_unlocked() instead since the error path is rare. this adjustment simplifies the code. 2) the memory-allocation is also moved into wq_pool_mutex. this is needed to avoid to do the further splitting. tj: minor updates to comments. Suggested-by: Tejun Heo Cc: Christoph Lameter Cc: Kevin Hilman Cc: Lai Jiangshan Cc: Mike Galbraith Cc: Paul E. McKenney Cc: Tejun Heo Cc: Viresh Kumar Cc: Frederic Weisbecker Signed-off-by: Lai Jiangshan Signed-off-by: Tejun Heo Signed-off-by: Sasha Levin commit de7b555fc95c1e54b6f4b1d0487ad0cc13e30eca Author: Alexandra Yates Date: Fri Feb 5 15:27:49 2016 -0800 ahci: Intel DNV device IDs SATA [ Upstream commit 342decff2b846b46fa61eb5ee40986fab79a9a32 ] Adding Intel codename DNV platform device IDs for SATA. Signed-off-by: Alexandra Yates Signed-off-by: Tejun Heo Cc: stable@vger.kernel.org Signed-off-by: Sasha Levin commit bb789d042685425c595b63a01ed555d5c1c404eb Author: Tony Lindgren Date: Mon Nov 30 21:39:54 2015 -0800 phy: twl4030-usb: Fix unbalanced pm_runtime_enable on module reload [ Upstream commit 58a66dba1beac2121d931cda4682ae4d40816af5 ] If we reload phy-twl4030-usb, we get a warning about unbalanced pm_runtime_enable. Let's fix the issue and also fix idling of the device on unload before we attempt to shut it down. If we don't properly idle the PHY before shutting it down on removal, the twl4030 ends up consuming about 62mW of extra power compared to running idle with the module loaded. Cc: stable@vger.kernel.org Cc: Bin Liu Cc: Felipe Balbi Cc: Kishon Vijay Abraham I Cc: NeilBrown Signed-off-by: Tony Lindgren Signed-off-by: Kishon Vijay Abraham I Signed-off-by: Sasha Levin commit feebbdd7d01430b5ca620bec2b637909d498fa0f Author: Tony Lindgren Date: Mon Nov 30 21:39:53 2015 -0800 phy: twl4030-usb: Relase usb phy on unload [ Upstream commit b241d31ef2f6a289d33dcaa004714b26e06f476f ] Otherwise rmmod omap2430; rmmod phy-twl4030-usb; modprobe omap2430 will try to use a non-existing phy and oops: Unable to handle kernel paging request at virtual address b6f7c1f0 ... [] (devm_usb_get_phy_by_node) from [] (omap2430_musb_init+0x44/0x2b4 [omap2430]) [] (omap2430_musb_init [omap2430]) from [] (musb_init_controller+0x194/0x878 [musb_hdrc]) Cc: stable@vger.kernel.org Cc: Bin Liu Cc: Felipe Balbi Cc: Kishon Vijay Abraham I Cc: NeilBrown Signed-off-by: Tony Lindgren Signed-off-by: Kishon Vijay Abraham I Signed-off-by: Sasha Levin commit 7822d8eb3551dce8f0e594807d030cf8be342540 Author: Shawn Lin Date: Thu Jan 28 16:14:18 2016 +0800 phy: core: fix wrong err handle for phy_power_on [ Upstream commit b82fcabe212a11698fd4b3e604d2f81d929d22f6 ] If phy_pm_runtime_get_sync failed but we already enable regulator, current code return directly without doing regulator_disable. This patch fix this problem and cleanup err handle of phy_power_on to be more readable. Fixes: 3be88125d85d ("phy: core: Support regulator ...") Cc: # v3.18+ Cc: Roger Quadros Cc: Axel Lin Signed-off-by: Shawn Lin Signed-off-by: Kishon Vijay Abraham I Signed-off-by: Sasha Levin commit 68fce03ba7901aa338a566292a59e6a753948861 Author: Tejun Heo Date: Tue Feb 9 16:11:26 2016 -0500 Revert "workqueue: make sure delayed work run in local cpu" [ Upstream commit 041bd12e272c53a35c54c13875839bcb98c999ce ] This reverts commit 874bbfe600a660cba9c776b3957b1ce393151b76. Workqueue used to implicity guarantee that work items queued without explicit CPU specified are put on the local CPU. Recent changes in timer broke the guarantee and led to vmstat breakage which was fixed by 176bed1de5bf ("vmstat: explicitly schedule per-cpu work on the CPU we need it to run on"). vmstat is the most likely to expose the issue and it's quite possible that there are other similar problems which are a lot more difficult to trigger. As a preventive measure, 874bbfe600a6 ("workqueue: make sure delayed work run in local cpu") was applied to restore the local CPU guarnatee. Unfortunately, the change exposed a bug in timer code which got fixed by 22b886dd1018 ("timers: Use proper base migration in add_timer_on()"). Due to code restructuring, the commit couldn't be backported beyond certain point and stable kernels which only had 874bbfe600a6 started crashing. The local CPU guarantee was accidental more than anything else and we want to get rid of it anyway. As, with the vmstat case fixed, 874bbfe600a6 is causing more problems than it's fixing, it has been decided to take the chance and officially break the guarantee by reverting the commit. A debug feature will be added to force foreign CPU assignment to expose cases relying on the guarantee and fixes for the individual cases will be backported to stable as necessary. Signed-off-by: Tejun Heo Fixes: 874bbfe600a6 ("workqueue: make sure delayed work run in local cpu") Link: http://lkml.kernel.org/g/20160120211926.GJ10810@quack.suse.cz Cc: stable@vger.kernel.org Cc: Mike Galbraith Cc: Henrique de Moraes Holschuh Cc: Daniel Bilik Cc: Jan Kara Cc: Shaohua Li Cc: Sasha Levin Cc: Ben Hutchings Cc: Thomas Gleixner Cc: Daniel Bilik Cc: Jiri Slaby Cc: Michal Hocko Signed-off-by: Sasha Levin commit 0163f1a71f10b25eae8d7019124cd7f1141b109a Author: Takashi Iwai Date: Mon Feb 8 17:26:58 2016 +0100 ALSA: timer: Fix race at concurrent reads [ Upstream commit 4dff5c7b7093b19c19d3a100f8a3ad87cb7cd9e7 ] snd_timer_user_read() has a potential race among parallel reads, as qhead and qused are updated outside the critical section due to copy_to_user() calls. Move them into the critical section, and also sanitize the relevant code a bit. Cc: Signed-off-by: Takashi Iwai Signed-off-by: Sasha Levin commit dbe8ccc27ee453e435597e636b4c7445fdc8c6d5 Author: Takashi Iwai Date: Tue Feb 9 10:23:52 2016 +0100 ALSA: hda - Fix bad dereference of jack object [ Upstream commit 2ebab40eb74a0225d5dfba72bfae317dd948fa2d ] The hda_jack_tbl entries are managed by snd_array for allowing multiple jacks. It's good per se, but the problem is that struct hda_jack_callback keeps the hda_jack_tbl pointer. Since snd_array doesn't preserve each pointer at resizing the array, we can't keep the original pointer but have to deduce the pointer at each time via snd_array_entry() instead. Actually, this resulted in the deference to the wrong pointer on codecs that have many pins such as CS4208. This patch replaces the pointer to the NID value as the search key. As an unexpected good side effect, this even simplifies the code, as only NID is needed in most cases. Cc: Signed-off-by: Takashi Iwai Signed-off-by: Sasha Levin commit 9b7283948d5f83ac692ed0243e17e2f1d3843c6c Author: Takashi Iwai Date: Tue Feb 9 12:02:32 2016 +0100 ALSA: timer: Fix race between stop and interrupt [ Upstream commit ed8b1d6d2c741ab26d60d499d7fbb7ac801f0f51 ] A slave timer element also unlinks at snd_timer_stop() but it takes only slave_active_lock. When a slave is assigned to a master, however, this may become a race against the master's interrupt handling, eventually resulting in a list corruption. The actual bug could be seen with a syzkaller fuzzer test case in BugLink below. As a fix, we need to take timeri->timer->lock when timer isn't NULL, i.e. assigned to a master, while the assignment to a master itself is protected by slave_active_lock. BugLink: http://lkml.kernel.org/r/CACT4Y+Y_Bm+7epAb=8Wi=AaWd+DYS7qawX52qxdCfOfY49vozQ@mail.gmail.com Cc: Signed-off-by: Takashi Iwai Signed-off-by: Sasha Levin commit ca54f8d139ae5b261d3dba4c71bb08b46fc8f9ea Author: Linus Walleij Date: Mon Feb 8 09:14:37 2016 +0100 ARM: 8517/1: ICST: avoid arithmetic overflow in icst_hz() [ Upstream commit 5070fb14a0154f075c8b418e5bc58a620ae85a45 ] When trying to set the ICST 307 clock to 25174000 Hz I ran into this arithmetic error: the icst_hz_to_vco() correctly figure out DIVIDE=2, RDW=100 and VDW=99 yielding a frequency of 25174000 Hz out of the VCO. (I replicated the icst_hz() function in a spreadsheet to verify this.) However, when I called icst_hz() on these VCO settings it would instead return 4122709 Hz. This causes an error in the common clock driver for ICST as the common clock framework will call .round_rate() on the clock which will utilize icst_hz_to_vco() followed by icst_hz() suggesting the erroneous frequency, and then the clock gets set to this. The error did not manifest in the old clock framework since this high frequency was only used by the CLCD, which calls clk_set_rate() without first calling clk_round_rate() and since the old clock framework would not call clk_round_rate() before setting the frequency, the correct values propagated into the VCO. After some experimenting I figured out that it was due to a simple arithmetic overflow: the divisor for 24Mhz reference frequency as reference becomes 24000000*2*(99+8)=0x132212400 and the "1" in bit 32 overflows and is lost. But introducing an explicit 64-by-32 bit do_div() and casting the divisor into (u64) we get the right frequency back, and the right frequency gets set. Tested on the ARM Versatile. Cc: stable@vger.kernel.org Cc: linux-clk@vger.kernel.org Cc: Pawel Moll Signed-off-by: Linus Walleij Signed-off-by: Russell King Signed-off-by: Sasha Levin commit 47eef3475c5f3063fc20e938f2fe24cd84d731aa Author: Takashi Iwai Date: Mon Feb 8 17:36:25 2016 +0100 ALSA: timer: Fix wrong instance passed to slave callbacks [ Upstream commit 117159f0b9d392fb433a7871426fad50317f06f7 ] In snd_timer_notify1(), the wrong timer instance was passed for slave ccallback function. This leads to the access to the wrong data when an incompatible master is handled (e.g. the master is the sequencer timer and the slave is a user timer), as spotted by syzkaller fuzzer. This patch fixes that wrong assignment. BugLink: http://lkml.kernel.org/r/CACT4Y+Y_Bm+7epAb=8Wi=AaWd+DYS7qawX52qxdCfOfY49vozQ@mail.gmail.com Reported-by: Dmitry Vyukov Cc: Signed-off-by: Takashi Iwai Signed-off-by: Sasha Levin commit 6b20ea56ddee5353a29616692723c80793fb05c5 Author: Jani Nikula Date: Thu Feb 4 12:50:50 2016 +0200 drm/i915/dsi: don't pass arbitrary data to sideband [ Upstream commit 26f6f2d301c1fb46acb1138ee155125815239b0d ] Since sequence block v2 the second byte contains flags other than just pull up/down. Don't pass arbitrary data to the sideband interface. The rest may or may not work for sequence block v2, but there should be no harm done. Cc: stable@vger.kernel.org Reviewed-by: Ville Syrjälä Signed-off-by: Jani Nikula Link: http://patchwork.freedesktop.org/patch/msgid/ebe3c2eee623afc4b3a134533b01f8d591d13f32.1454582914.git.jani.nikula@intel.com (cherry picked from commit 4e1c63e3761b84ec7d87c75b58bbc8bcf18e98ee) Signed-off-by: Jani Nikula Signed-off-by: Sasha Levin commit 3c9dc5750d0c3052ba7a3bf0173ba429bc96f323 Author: Jani Nikula Date: Thu Feb 4 12:50:49 2016 +0200 drm/i915/dsi: defend gpio table against out of bounds access [ Upstream commit 4db3a2448ec8902310acb78de39b6227a9a56ac8 ] Do not blindly trust the VBT data used for indexing. Cc: stable@vger.kernel.org Reviewed-by: Ville Syrjälä Signed-off-by: Jani Nikula Link: http://patchwork.freedesktop.org/patch/msgid/cc32d40c2b47f2d2151811855ac2c3dabab1d57d.1454582914.git.jani.nikula@intel.com (cherry picked from commit 5d2d0a12d3d08bf50434f0b5947bb73bac04b941) Signed-off-by: Jani Nikula Signed-off-by: Sasha Levin commit c5e67773f634806feecb3779baf2158e182ccd8a Author: Takashi Iwai Date: Tue Feb 2 15:27:36 2016 +0100 ALSA: dummy: Implement timer backend switching more safely [ Upstream commit ddce57a6f0a2d8d1bfacfa77f06043bc760403c2 ] Currently the selected timer backend is referred at any moment from the running PCM callbacks. When the backend is switched, it's possible to lead to inconsistency from the running backend. This was pointed by syzkaller fuzzer, and the commit [7ee96216c31a: ALSA: dummy: Disable switching timer backend via sysfs] disabled the dynamic switching for avoiding the crash. This patch improves the handling of timer backend switching. It keeps the reference to the selected backend during the whole operation of an opened stream so that it won't be changed by other streams. Together with this change, the hrtimer parameter is reenabled as writable now. NOTE: this patch also turned out to fix the still remaining race. Namely, ops was still replaced dynamically at dummy_pcm_open: static int dummy_pcm_open(struct snd_pcm_substream *substream) { .... dummy->timer_ops = &dummy_systimer_ops; if (hrtimer) dummy->timer_ops = &dummy_hrtimer_ops; Since dummy->timer_ops is common among all streams, and when the replacement happens during accesses of other streams, it may lead to a crash. This was actually triggered by syzkaller fuzzer and KASAN. This patch rewrites the code not to use the ops shared by all streams any longer, too. BugLink: http://lkml.kernel.org/r/CACT4Y+aZ+xisrpuM6cOXbL21DuM0yVxPYXf4cD4Md9uw0C3dBQ@mail.gmail.com Reported-by: Dmitry Vyukov Cc: Signed-off-by: Takashi Iwai Signed-off-by: Sasha Levin commit 84ec02eecb3b257f14cb8b10ffcd73d5420c96c0 Author: James Bottomley Date: Wed Jan 13 08:10:31 2016 -0800 klist: fix starting point removed bug in klist iterators [ Upstream commit 00cd29b799e3449f0c68b1cc77cd4a5f95b42d17 ] The starting node for a klist iteration is often passed in from somewhere way above the klist infrastructure, meaning there's no guarantee the node is still on the list. We've seen this in SCSI where we use bus_find_device() to iterate through a list of devices. In the face of heavy hotplug activity, the last device returned by bus_find_device() can be removed before the next call. This leads to Dec 3 13:22:02 localhost kernel: WARNING: CPU: 2 PID: 28073 at include/linux/kref.h:47 klist_iter_init_node+0x3d/0x50() Dec 3 13:22:02 localhost kernel: Modules linked in: scsi_debug x86_pkg_temp_thermal kvm_intel kvm irqbypass crc32c_intel joydev iTCO_wdt dcdbas ipmi_devintf acpi_power_meter iTCO_vendor_support ipmi_si imsghandler pcspkr wmi acpi_cpufreq tpm_tis tpm shpchp lpc_ich mfd_core nfsd nfs_acl lockd grace sunrpc tg3 ptp pps_core Dec 3 13:22:02 localhost kernel: CPU: 2 PID: 28073 Comm: cat Not tainted 4.4.0-rc1+ #2 Dec 3 13:22:02 localhost kernel: Hardware name: Dell Inc. PowerEdge R320/08VT7V, BIOS 2.0.22 11/19/2013 Dec 3 13:22:02 localhost kernel: ffffffff81a20e77 ffff880613acfd18 ffffffff81321eef 0000000000000000 Dec 3 13:22:02 localhost kernel: ffff880613acfd50 ffffffff8107ca52 ffff88061176b198 0000000000000000 Dec 3 13:22:02 localhost kernel: ffffffff814542b0 ffff880610cfb100 ffff88061176b198 ffff880613acfd60 Dec 3 13:22:02 localhost kernel: Call Trace: Dec 3 13:22:02 localhost kernel: [] dump_stack+0x44/0x55 Dec 3 13:22:02 localhost kernel: [] warn_slowpath_common+0x82/0xc0 Dec 3 13:22:02 localhost kernel: [] ? proc_scsi_show+0x20/0x20 Dec 3 13:22:02 localhost kernel: [] warn_slowpath_null+0x1a/0x20 Dec 3 13:22:02 localhost kernel: [] klist_iter_init_node+0x3d/0x50 Dec 3 13:22:02 localhost kernel: [] bus_find_device+0x51/0xb0 Dec 3 13:22:02 localhost kernel: [] scsi_seq_next+0x2d/0x40 [...] And an eventual crash. It can actually occur in any hotplug system which has a device finder and a starting device. We can fix this globally by making sure the starting node for klist_iter_init_node() is actually a member of the list before using it (and by starting from the beginning if it isn't). Reported-by: Ewan D. Milne Tested-by: Ewan D. Milne Cc: stable@vger.kernel.org Signed-off-by: James Bottomley Signed-off-by: Greg Kroah-Hartman Signed-off-by: Sasha Levin commit 485037307b65916364a5d9ac77e44d5cb64073f0 Author: Takashi Iwai Date: Sun Feb 7 09:38:26 2016 +0100 ALSA: hda - Fix speaker output from VAIO AiO machines [ Upstream commit c44d9b1181cf34e0860c72cc8a00e0c47417aac0 ] Some Sony VAIO AiO models (VGC-JS4EF and VGC-JS25G, both with PCI SSID 104d:9044) need the same quirk to make the speaker working properly. Bugzilla: https://bugzilla.kernel.org/show_bug.cgi?id=112031 Cc: Signed-off-by: Takashi Iwai Signed-off-by: Sasha Levin commit 38bdf7e3fb81c3d9cbbacab6f96e9fd28ff6c704 Author: Herton R. Krzesinski Date: Thu Jan 14 17:56:58 2016 -0200 pty: make sure super_block is still valid in final /dev/tty close [ Upstream commit 1f55c718c290616889c04946864a13ef30f64929 ] Considering current pty code and multiple devpts instances, it's possible to umount a devpts file system while a program still has /dev/tty opened pointing to a previosuly closed pty pair in that instance. In the case all ptmx and pts/N files are closed, umount can be done. If the program closes /dev/tty after umount is done, devpts_kill_index will use now an invalid super_block, which was already destroyed in the umount operation after running ->kill_sb. This is another "use after free" type of issue, but now related to the allocated super_block instance. To avoid the problem (warning at ida_remove and potential crashes) for this specific case, I added two functions in devpts which grabs additional references to the super_block, which pty code now uses so it makes sure the super block structure is still valid until pty shutdown is done. I also moved the additional inode references to the same functions, which also covered similar case with inode being freed before /dev/tty final close/shutdown. Signed-off-by: Herton R. Krzesinski Cc: stable@vger.kernel.org # 2.6.29+ Reviewed-by: Peter Hurley Signed-off-by: Greg Kroah-Hartman Signed-off-by: Sasha Levin commit 614f8734d11ad22ee17a5faecf355b70756904ef Author: Herton R. Krzesinski Date: Mon Jan 11 12:07:43 2016 -0200 pty: fix possible use after free of tty->driver_data [ Upstream commit 2831c89f42dcde440cfdccb9fee9f42d54bbc1ef ] This change fixes a bug for a corner case where we have the the last release from a pty master/slave coming from a previously opened /dev/tty file. When this happens, the tty->driver_data can be stale, due to all ptmx or pts/N files having already been closed before (and thus the inode related to these files, which tty->driver_data points to, being already freed/destroyed). The fix here is to keep a reference on the opened master ptmx inode. We maintain the inode referenced until the final pty_unix98_shutdown, and only pass this inode to devpts_kill_index. Signed-off-by: Herton R. Krzesinski Cc: # 2.6.29+ Reviewed-by: Peter Hurley Signed-off-by: Greg Kroah-Hartman Signed-off-by: Sasha Levin commit 6ca45550112e975be2298b0feda49b686029cc32 Author: Jeremy McNicoll Date: Tue Feb 2 13:00:45 2016 -0800 tty: Add support for PCIe WCH382 2S multi-IO card [ Upstream commit 7dde55787b43a8f2b4021916db38d90c03a2ec64 ] WCH382 2S board is a PCIe card with 2 DB9 COM ports detected as Serial controller: Device 1c00:3253 (rev 10) (prog-if 05 [16850]) Signed-off-by: Jeremy McNicoll Cc: stable Signed-off-by: Greg Kroah-Hartman Signed-off-by: Sasha Levin commit 56819934d6148c68c85eb181f6153b36f028f99c Author: Peter Hurley Date: Tue Jan 12 15:14:46 2016 -0800 serial: omap: Prevent DoS using unprivileged ioctl(TIOCSRS485) [ Upstream commit 308bbc9ab838d0ace0298268c7970ba9513e2c65 ] The omap-serial driver emulates RS485 delays using software timers, but neglects to clamp the input values from the unprivileged ioctl(TIOCSRS485). Because the software implementation busy-waits, malicious userspace could stall the cpu for ~49 days. Clamp the input values to < 100ms. Fixes: 4a0ac0f55b18 ("OMAP: add RS485 support") Cc: # 3.12+ Signed-off-by: Peter Hurley Signed-off-by: Greg Kroah-Hartman Signed-off-by: Sasha Levin commit 8b6349ec0eefde22c3348b65804e343be1245a59 Author: Quinn Tran Date: Thu Feb 4 11:45:16 2016 -0500 qla2xxx: Use pci_enable_msix_range() instead of pci_enable_msix() [ Upstream commit 84e32a06f4f8756ce9ec3c8dc7e97896575f0771 ] As result of deprecation of MSI-X/MSI enablement functions pci_enable_msix() and pci_enable_msi_block() all drivers using these two interfaces need to be updated to use the new pci_enable_msi_range() or pci_enable_msi_exact() and pci_enable_msix_range() or pci_enable_msix_exact() interfaces. Log message code 0x00c6 preserved, although it is reported after successful call to pci_enable_msix_range(), not before possibly unsuccessful call to pci_enable_msix(). Consumers of the error code should not notice the difference. Signed-off-by: Alexander Gordeev Acked-by: Chad Dupuis Cc: qla2xxx-upstream@qlogic.com Cc: linux-scsi@vger.kernel.org Cc: linux-pci@vger.kernel.org Signed-off-by: Christoph Hellwig Signed-off-by: Sasha Levin commit 0a86238d2e2ad9b74329622c72e99cacbbaf6485 Author: Cyrille Pitchen Date: Fri Feb 5 13:45:13 2016 +0100 crypto: atmel-sha - remove calls of clk_prepare() from atomic contexts [ Upstream commit ee36c87a655325a7b5e442a9650a782db4ea20d2 ] commit c033042aa8f69894df37dabcaa0231594834a4e4 upstream. clk_prepare()/clk_unprepare() must not be called within atomic context. This patch calls clk_prepare() once for all from atmel_sha_probe() and clk_unprepare() from atmel_sha_remove(). Then calls of clk_prepare_enable()/clk_disable_unprepare() were replaced by calls of clk_enable()/clk_disable(). Signed-off-by: Cyrille Pitchen Reported-by: Matthias Mayr Signed-off-by: Herbert Xu Signed-off-by: Greg Kroah-Hartman Signed-off-by: Sasha Levin commit 617c364c29ce8eefa9f0122fd01829adc2ac5d83 Author: LABBE Corentin Date: Fri Oct 2 14:12:58 2015 +0200 crypto: atmel - Check for clk_prepare_enable() return value [ Upstream commit 9d83d299549d0e121245d56954242750d0c14338 ] clk_prepare_enable() can fail so add a check for this and return the error code if it fails. Signed-off-by: LABBE Corentin Signed-off-by: Herbert Xu Signed-off-by: Sasha Levin commit 5955bf64814d8663117e9397373fb20f9f71dff0 Author: LABBE Corentin Date: Mon Oct 12 19:47:03 2015 +0200 crypto: atmel - use devm_xxx() managed function [ Upstream commit b0e8b3417a620e6e0a91fd526fbc6db78714198e ] Using the devm_xxx() managed function to stripdown the error and remove code. Signed-off-by: LABBE Corentin Signed-off-by: Herbert Xu Signed-off-by: Sasha Levin commit 64c4131f16acba869e5280ee46e7dbf93fd8b960 Author: Herbert Xu Date: Fri Feb 26 12:44:11 2016 +0100 Backport fix for crypto: algif_skcipher - Fix race condition in skcipher_check_key commit 1822793a523e5d5730b19cc21160ff1717421bc8 upstream. We need to lock the child socket in skcipher_check_key as otherwise two simultaneous calls can cause the parent socket to be freed. Cc: stable@vger.kernel.org Signed-off-by: Herbert Xu Signed-off-by: Sasha Levin commit 7327b23abeb498c103125f234c7698115db5970b Author: Herbert Xu Date: Fri Feb 26 12:44:10 2016 +0100 Backport fix for crypto: algif_skcipher - Remove custom release parent function commit d7b65aee1e7b4c87922b0232eaba56a8a143a4a0 upstream. This patch removes the custom release parent function as the generic af_alg_release_parent now works for nokey sockets too. Cc: stable@vger.kernel.org Signed-off-by: Herbert Xu Signed-off-by: Sasha Levin commit 1ce5c14e78a7c44a6f9c6af3548748dbe22a80af Author: Herbert Xu Date: Fri Feb 26 12:44:09 2016 +0100 Backport fix for crypto: algif_skcipher - Add nokey compatibility path commit a0fa2d037129a9849918a92d91b79ed6c7bd2818 upstream. This patch adds a compatibility path to support old applications that do acept(2) before setkey. Cc: stable@vger.kernel.org Signed-off-by: Herbert Xu Signed-off-by: Sasha Levin commit 762330b161c49c6d88ab689a0ee2a1a959dc5b6b Author: Herbert Xu Date: Fri Feb 26 12:44:08 2016 +0100 Backport fix for crypto: algif_skcipher - Require setkey before accept(2) commit dd504589577d8e8e70f51f997ad487a4cb6c026f upstream. Some cipher implementations will crash if you try to use them without calling setkey first. This patch adds a check so that the accept(2) call will fail with -ENOKEY if setkey hasn't been done on the socket yet. Cc: stable@vger.kernel.org Reported-by: Dmitry Vyukov Signed-off-by: Herbert Xu Tested-by: Dmitry Vyukov [backported to 4.1 by Milan Broz ] Signed-off-by: Sasha Levin commit 09b09ced1e4e3afb2767ee96edc9d1d57fc1c55b Author: Mathias Krause Date: Mon Feb 1 14:27:30 2016 +0100 crypto: user - lock crypto_alg_list on alg dump [ Upstream commit 63e41ebc6630f39422d87f8a4bade1e793f37a01 ] We miss to take the crypto_alg_sem semaphore when traversing the crypto_alg_list for CRYPTO_MSG_GETALG dumps. This allows a race with crypto_unregister_alg() removing algorithms from the list while we're still traversing it, thereby leading to a use-after-free as show below: [ 3482.071639] general protection fault: 0000 [#1] SMP [ 3482.075639] Modules linked in: aes_x86_64 glue_helper lrw ablk_helper cryptd gf128mul ipv6 pcspkr serio_raw virtio_net microcode virtio_pci virtio_ring virtio sr_mod cdrom [last unloaded: aesni_intel] [ 3482.075639] CPU: 1 PID: 11065 Comm: crconf Not tainted 4.3.4-grsec+ #126 [ 3482.075639] Hardware name: QEMU Standard PC (i440FX + PIIX, 1996), BIOS 1.7.5-20140531_083030-gandalf 04/01/2014 [ 3482.075639] task: ffff88001cd41a40 ti: ffff88001cd422c8 task.ti: ffff88001cd422c8 [ 3482.075639] RIP: 0010:[] [] strncpy+0x13/0x30 [ 3482.075639] RSP: 0018:ffff88001f713b60 EFLAGS: 00010202 [ 3482.075639] RAX: ffff88001f6c4430 RBX: ffff88001f6c43a0 RCX: ffff88001f6c4430 [ 3482.075639] RDX: 0000000000000040 RSI: fefefefefefeff16 RDI: ffff88001f6c4430 [ 3482.075639] RBP: ffff88001f713b60 R08: ffff88001f6c4470 R09: ffff88001f6c4480 [ 3482.075639] R10: 0000000000000002 R11: 0000000000000246 R12: ffff88001ce2aa28 [ 3482.075639] R13: ffff880000093700 R14: ffff88001f5e4bf8 R15: 0000000000003b20 [ 3482.075639] FS: 0000033826fa2700(0000) GS:ffff88001e900000(0000) knlGS:0000000000000000 [ 3482.075639] CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033 [ 3482.075639] CR2: ffffffffff600400 CR3: 00000000139ec000 CR4: 00000000001606f0 [ 3482.075639] Stack: [ 3482.075639] ffff88001f713bd8 ffffffff936ccd00 ffff88001e5c4200 ffff880000093700 [ 3482.075639] ffff88001f713bd0 ffffffff938ef4bf 0000000000000000 0000000000003b20 [ 3482.075639] ffff88001f5e4bf8 ffff88001f5e4848 0000000000000000 0000000000003b20 [ 3482.075639] Call Trace: [ 3482.075639] [] crypto_report_alg+0xc0/0x3e0 [ 3482.075639] [] ? __alloc_skb+0x16f/0x300 [ 3482.075639] [] crypto_dump_report+0x6a/0x90 [ 3482.075639] [] netlink_dump+0x147/0x2e0 [ 3482.075639] [] __netlink_dump_start+0x159/0x190 [ 3482.075639] [] crypto_user_rcv_msg+0xc3/0x130 [ 3482.075639] [] ? crypto_report_alg+0x3e0/0x3e0 [ 3482.075639] [] ? alg_test_crc32c+0x120/0x120 [ 3482.075639] [] ? __netlink_lookup+0xd5/0x120 [ 3482.075639] [] ? crypto_add_alg+0x1d0/0x1d0 [ 3482.075639] [] netlink_rcv_skb+0xe1/0x130 [ 3482.075639] [] crypto_netlink_rcv+0x28/0x40 [ 3482.075639] [] netlink_unicast+0x108/0x180 [ 3482.075639] [] netlink_sendmsg+0x541/0x770 [ 3482.075639] [] sock_sendmsg+0x21/0x40 [ 3482.075639] [] SyS_sendto+0xf3/0x130 [ 3482.075639] [] ? bad_area_nosemaphore+0x13/0x20 [ 3482.075639] [] ? __do_page_fault+0x80/0x3a0 [ 3482.075639] [] entry_SYSCALL_64_fastpath+0x12/0x6e [ 3482.075639] Code: 88 4a ff 75 ed 5d 48 0f ba 2c 24 3f c3 66 66 2e 0f 1f 84 00 00 00 00 00 55 48 85 d2 48 89 f8 48 89 f9 4c 8d 04 17 48 89 e5 74 15 <0f> b6 16 80 fa 01 88 11 48 83 de ff 48 83 c1 01 4c 39 c1 75 eb [ 3482.075639] RIP [] strncpy+0x13/0x30 To trigger the race run the following loops simultaneously for a while: $ while : ; do modprobe aesni-intel; rmmod aesni-intel; done $ while : ; do crconf show all > /dev/null; done Fix the race by taking the crypto_alg_sem read lock, thereby preventing crypto_unregister_alg() from modifying the algorithm list during the dump. This bug has been detected by the PaX memory sanitize feature. Cc: stable@vger.kernel.org Signed-off-by: Mathias Krause Cc: Steffen Klassert Cc: PaX Team Signed-off-by: Herbert Xu Signed-off-by: Sasha Levin commit 0951360a3c8e4633016836a1adec7c4b1f090f3c Author: Takashi Iwai Date: Fri Feb 5 20:12:24 2016 +0100 Revert "ALSA: hda - Fix noise on Gigabyte Z170X mobo" [ Upstream commit 6c361d10e0eb859233c71954abcd20d2d8700587 ] This reverts commit 0c25ad80408e95e0a4fbaf0056950206e95f726f. The original commit disabled the aamixer path due to the noise problem, but it turned out that some mobo with the same PCI SSID doesn't suffer from the issue, and the disabled function (analog loopback) is still demanded by users. Since the recent commit [e7fdd52779a6: ALSA: hda - Implement loopback control switch for Realtek and other codecs], we have the dynamic mixer switch to enable/disable the aamix path, and we don't have to disable the path statically any longer. So, let's revert the disablement, so that only the user suffering from the noise problem can turn off the aamix on the fly. Bugzilla: https://bugzilla.kernel.org/show_bug.cgi?id=108301 Reported-by: Cc: Signed-off-by: Takashi Iwai Signed-off-by: Sasha Levin commit 6cf9d4c32e8e2716eb8e9cb578dbcb2b2cad04b3 Author: David Henningsson Date: Fri Feb 5 09:05:41 2016 +0100 ALSA: hda - Fix static checker warning in patch_hdmi.c [ Upstream commit 360a8245680053619205a3ae10e6bfe624a5da1d ] The static checker warning is: sound/pci/hda/patch_hdmi.c:460 hdmi_eld_ctl_get() error: __memcpy() 'eld->eld_buffer' too small (256 vs 512) I have a hard time figuring out if this can ever cause an information leak (I don't think so), but nonetheless it does not hurt to increase the robustness of the code. Fixes: 68e03de98507 ('ALSA: hda - hdmi: Do not expose eld data when eld is invalid') Reported-by: Dan Carpenter Signed-off-by: David Henningsson Cc: # v3.9+ Signed-off-by: Takashi Iwai Signed-off-by: Sasha Levin commit 75b546e5147b81fe064454ba8a06ee58590f0f0d Author: Mika Westerberg Date: Wed Jan 27 16:19:13 2016 +0200 SCSI: Add Marvell Console to VPD blacklist [ Upstream commit 82c43310508eb19eb41fe7862e89afeb74030b84 ] I have a Marvell 88SE9230 SATA Controller that has some sort of integrated console SCSI device attached to one of the ports. ata14: SATA link up 1.5 Gbps (SStatus 113 SControl 300) ata14.00: ATAPI: MARVELL VIRTUALL, 1.09, max UDMA/66 ata14.00: configured for UDMA/66 scsi 13:0:0:0: Processor Marvell Console 1.01 PQ: 0 ANSI: 5 Sending it VPD INQUIRY command seem to always fail with following error: ata14.00: exception Emask 0x0 SAct 0x0 SErr 0x0 action 0x6 ata14.00: irq_stat 0x40000001 ata14.00: cmd a0/01:00:00:00:01/00:00:00:00:00/a0 tag 2 dma 16640 in Inquiry 12 01 00 00 ff 00res 00/00:00:00:00:00/00:00:00:00:00/00 Emask 0x3 (HSM violation) ata14: hard resetting link This has been minor annoyance (only error printed on dmesg) until commit 09e2b0b14690 ("scsi: rescan VPD attributes") added call to scsi_attach_vpd() in scsi_rescan_device(). The commit causes the system to splat out following errors continuously without ever reaching the UI: ata14.00: configured for UDMA/66 ata14: EH complete ata14.00: exception Emask 0x0 SAct 0x0 SErr 0x0 action 0x6 ata14.00: irq_stat 0x40000001 ata14.00: cmd a0/01:00:00:00:01/00:00:00:00:00/a0 tag 6 dma 16640 in Inquiry 12 01 00 00 ff 00res 00/00:00:00:00:00/00:00:00:00:00/00 Emask 0x3 (HSM violation) ata14: hard resetting link ata14: SATA link up 1.5 Gbps (SStatus 113 SControl 300) ata14.00: configured for UDMA/66 ata14: EH complete ata14.00: exception Emask 0x0 SAct 0x0 SErr 0x0 action 0x6 ata14.00: irq_stat 0x40000001 ata14.00: cmd a0/01:00:00:00:01/00:00:00:00:00/a0 tag 7 dma 16640 in Inquiry 12 01 00 00 ff 00res 00/00:00:00:00:00/00:00:00:00:00/00 Emask 0x3 (HSM violation) Without in-depth understanding of SCSI layer and the Marvell controller, I suspect this happens because when the link goes down (because of an error) we schedule scsi_rescan_device() which again fails to read VPD data... ad infinitum. Since VPD data cannot be read from the device anyway we prevent the SCSI layer from even trying by blacklisting the device. This gets away the error and the system starts up normally. [mkp: Widened the match to all revisions of this device] Cc: Signed-off-by: Mika Westerberg Reported-by: Kirill A. Shutemov Reported-by: Alexander Duyck Signed-off-by: Martin K. Petersen Signed-off-by: Sasha Levin commit d1d1545f542c5ab95d2bbec49f7b425fcf1f4166 Author: Hannes Reinecke Date: Fri Jan 22 15:42:41 2016 +0100 scsi_dh_rdac: always retry MODE SELECT on command lock violation [ Upstream commit d2d06d4fe0f2cc2df9b17fefec96e6e1a1271d91 ] If MODE SELECT returns with sense '05/91/36' (command lock violation) it should always be retried without counting the number of retries. During an HBA upgrade or similar circumstances one might see a flood of MODE SELECT command from various HBAs, which will easily trigger the sense code and exceed the retry count. Cc: Signed-off-by: Hannes Reinecke Reviewed-by: Johannes Thumshirn Signed-off-by: Martin K. Petersen Signed-off-by: Sasha Levin commit 79179a68458a7fddcf35adb16d1f7a7880aa41e7 Author: Filipe Manana Date: Wed Feb 3 19:17:27 2016 +0000 Btrfs: fix hang on extent buffer lock caused by the inode_paths ioctl [ Upstream commit 0c0fe3b0fa45082cd752553fdb3a4b42503a118e ] While doing some tests I ran into an hang on an extent buffer's rwlock that produced the following trace: [39389.800012] NMI watchdog: BUG: soft lockup - CPU#15 stuck for 22s! [fdm-stress:32166] [39389.800016] NMI watchdog: BUG: soft lockup - CPU#14 stuck for 22s! [fdm-stress:32165] [39389.800016] Modules linked in: btrfs dm_mod ppdev xor sha256_generic hmac raid6_pq drbg ansi_cprng aesni_intel i2c_piix4 acpi_cpufreq aes_x86_64 ablk_helper tpm_tis parport_pc i2c_core sg cryptd evdev psmouse lrw tpm parport gf128mul serio_raw pcspkr glue_helper processor button loop autofs4 ext4 crc16 mbcache jbd2 sd_mod sr_mod cdrom ata_generic virtio_scsi ata_piix libata virtio_pci virtio_ring crc32c_intel scsi_mod e1000 virtio floppy [last unloaded: btrfs] [39389.800016] irq event stamp: 0 [39389.800016] hardirqs last enabled at (0): [< (null)>] (null) [39389.800016] hardirqs last disabled at (0): [] copy_process+0x638/0x1a35 [39389.800016] softirqs last enabled at (0): [] copy_process+0x638/0x1a35 [39389.800016] softirqs last disabled at (0): [< (null)>] (null) [39389.800016] CPU: 14 PID: 32165 Comm: fdm-stress Not tainted 4.4.0-rc6-btrfs-next-18+ #1 [39389.800016] Hardware name: QEMU Standard PC (i440FX + PIIX, 1996), BIOS by qemu-project.org 04/01/2014 [39389.800016] task: ffff880175b1ca40 ti: ffff8800a185c000 task.ti: ffff8800a185c000 [39389.800016] RIP: 0010:[] [] queued_spin_lock_slowpath+0x57/0x158 [39389.800016] RSP: 0018:ffff8800a185fb80 EFLAGS: 00000202 [39389.800016] RAX: 0000000000000101 RBX: ffff8801710c4e9c RCX: 0000000000000101 [39389.800016] RDX: 0000000000000100 RSI: 0000000000000001 RDI: 0000000000000001 [39389.800016] RBP: ffff8800a185fb98 R08: 0000000000000001 R09: 0000000000000000 [39389.800016] R10: ffff8800a185fb68 R11: 6db6db6db6db6db7 R12: ffff8801710c4e98 [39389.800016] R13: ffff880175b1ca40 R14: ffff8800a185fc10 R15: ffff880175b1ca40 [39389.800016] FS: 00007f6d37fff700(0000) GS:ffff8802be9c0000(0000) knlGS:0000000000000000 [39389.800016] CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033 [39389.800016] CR2: 00007f6d300019b8 CR3: 0000000037c93000 CR4: 00000000001406e0 [39389.800016] Stack: [39389.800016] ffff8801710c4e98 ffff8801710c4e98 ffff880175b1ca40 ffff8800a185fbb0 [39389.800016] ffffffff81091e11 ffff8801710c4e98 ffff8800a185fbc8 ffffffff81091895 [39389.800016] ffff8801710c4e98 ffff8800a185fbe8 ffffffff81486c5c ffffffffa067288c [39389.800016] Call Trace: [39389.800016] [] queued_read_lock_slowpath+0x46/0x60 [39389.800016] [] do_raw_read_lock+0x3e/0x41 [39389.800016] [] _raw_read_lock+0x3d/0x44 [39389.800016] [] ? btrfs_tree_read_lock+0x54/0x125 [btrfs] [39389.800016] [] btrfs_tree_read_lock+0x54/0x125 [btrfs] [39389.800016] [] ? btrfs_find_item+0xa7/0xd2 [btrfs] [39389.800016] [] btrfs_ref_to_path+0xd6/0x174 [btrfs] [39389.800016] [] inode_to_path+0x53/0xa2 [btrfs] [39389.800016] [] paths_from_inode+0x117/0x2ec [btrfs] [39389.800016] [] btrfs_ioctl+0xd5b/0x2793 [btrfs] [39389.800016] [] ? arch_local_irq_save+0x9/0xc [39389.800016] [] ? __this_cpu_preempt_check+0x13/0x15 [39389.800016] [] ? arch_local_irq_save+0x9/0xc [39389.800016] [] ? rcu_read_unlock+0x3e/0x5d [39389.800016] [] do_vfs_ioctl+0x42b/0x4ea [39389.800016] [] ? __fget_light+0x62/0x71 [39389.800016] [] SyS_ioctl+0x57/0x79 [39389.800016] [] entry_SYSCALL_64_fastpath+0x12/0x6f [39389.800016] Code: b9 01 01 00 00 f7 c6 00 ff ff ff 75 32 83 fe 01 89 ca 89 f0 0f 45 d7 f0 0f b1 13 39 f0 74 04 89 c6 eb e2 ff ca 0f 84 fa 00 00 00 <8b> 03 84 c0 74 04 f3 90 eb f6 66 c7 03 01 00 e9 e6 00 00 00 e8 [39389.800012] Modules linked in: btrfs dm_mod ppdev xor sha256_generic hmac raid6_pq drbg ansi_cprng aesni_intel i2c_piix4 acpi_cpufreq aes_x86_64 ablk_helper tpm_tis parport_pc i2c_core sg cryptd evdev psmouse lrw tpm parport gf128mul serio_raw pcspkr glue_helper processor button loop autofs4 ext4 crc16 mbcache jbd2 sd_mod sr_mod cdrom ata_generic virtio_scsi ata_piix libata virtio_pci virtio_ring crc32c_intel scsi_mod e1000 virtio floppy [last unloaded: btrfs] [39389.800012] irq event stamp: 0 [39389.800012] hardirqs last enabled at (0): [< (null)>] (null) [39389.800012] hardirqs last disabled at (0): [] copy_process+0x638/0x1a35 [39389.800012] softirqs last enabled at (0): [] copy_process+0x638/0x1a35 [39389.800012] softirqs last disabled at (0): [< (null)>] (null) [39389.800012] CPU: 15 PID: 32166 Comm: fdm-stress Tainted: G L 4.4.0-rc6-btrfs-next-18+ #1 [39389.800012] Hardware name: QEMU Standard PC (i440FX + PIIX, 1996), BIOS by qemu-project.org 04/01/2014 [39389.800012] task: ffff880179294380 ti: ffff880034a60000 task.ti: ffff880034a60000 [39389.800012] RIP: 0010:[] [] queued_write_lock_slowpath+0x62/0x72 [39389.800012] RSP: 0018:ffff880034a639f0 EFLAGS: 00000206 [39389.800012] RAX: 0000000000000101 RBX: ffff8801710c4e98 RCX: 0000000000000000 [39389.800012] RDX: 00000000000000ff RSI: 0000000000000000 RDI: ffff8801710c4e9c [39389.800012] RBP: ffff880034a639f8 R08: 0000000000000001 R09: 0000000000000000 [39389.800012] R10: ffff880034a639b0 R11: 0000000000001000 R12: ffff8801710c4e98 [39389.800012] R13: 0000000000000001 R14: ffff880172cbc000 R15: ffff8801710c4e00 [39389.800012] FS: 00007f6d377fe700(0000) GS:ffff8802be9e0000(0000) knlGS:0000000000000000 [39389.800012] CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033 [39389.800012] CR2: 00007f6d3d3c1000 CR3: 0000000037c93000 CR4: 00000000001406e0 [39389.800012] Stack: [39389.800012] ffff8801710c4e98 ffff880034a63a10 ffffffff81091963 ffff8801710c4e98 [39389.800012] ffff880034a63a30 ffffffff81486f1b ffffffffa0672cb3 ffff8801710c4e00 [39389.800012] ffff880034a63a78 ffffffffa0672cb3 ffff8801710c4e00 ffff880034a63a58 [39389.800012] Call Trace: [39389.800012] [] do_raw_write_lock+0x72/0x8c [39389.800012] [] _raw_write_lock+0x3a/0x41 [39389.800012] [] ? btrfs_tree_lock+0x119/0x251 [btrfs] [39389.800012] [] btrfs_tree_lock+0x119/0x251 [btrfs] [39389.800012] [] ? rcu_read_unlock+0x5b/0x5d [btrfs] [39389.800012] [] ? btrfs_root_node+0xda/0xe6 [btrfs] [39389.800012] [] btrfs_lock_root_node+0x22/0x42 [btrfs] [39389.800012] [] btrfs_search_slot+0x1b8/0x758 [btrfs] [39389.800012] [] ? time_hardirqs_on+0x15/0x28 [39389.800012] [] btrfs_lookup_inode+0x31/0x95 [btrfs] [39389.800012] [] ? trace_hardirqs_on+0xd/0xf [39389.800012] [] ? mutex_lock_nested+0x397/0x3bc [39389.800012] [] __btrfs_update_delayed_inode+0x59/0x1c0 [btrfs] [39389.800012] [] __btrfs_commit_inode_delayed_items+0x194/0x5aa [btrfs] [39389.800012] [] ? _raw_spin_unlock+0x31/0x44 [39389.800012] [] __btrfs_run_delayed_items+0xa4/0x15c [btrfs] [39389.800012] [] btrfs_run_delayed_items+0x11/0x13 [btrfs] [39389.800012] [] btrfs_commit_transaction+0x234/0x96e [btrfs] [39389.800012] [] btrfs_sync_fs+0x145/0x1ad [btrfs] [39389.800012] [] btrfs_ioctl+0x11d2/0x2793 [btrfs] [39389.800012] [] ? arch_local_irq_save+0x9/0xc [39389.800012] [] ? __might_fault+0x4c/0xa7 [39389.800012] [] ? __might_fault+0x4c/0xa7 [39389.800012] [] ? arch_local_irq_save+0x9/0xc [39389.800012] [] ? rcu_read_unlock+0x3e/0x5d [39389.800012] [] do_vfs_ioctl+0x42b/0x4ea [39389.800012] [] ? __fget_light+0x62/0x71 [39389.800012] [] SyS_ioctl+0x57/0x79 [39389.800012] [] entry_SYSCALL_64_fastpath+0x12/0x6f [39389.800012] Code: f0 0f b1 13 85 c0 75 ef eb 2a f3 90 8a 03 84 c0 75 f8 f0 0f b0 13 84 c0 75 f0 ba ff 00 00 00 eb 0a f0 0f b1 13 ff c8 74 0b f3 90 <8b> 03 83 f8 01 75 f7 eb ed c6 43 04 00 5b 5d c3 0f 1f 44 00 00 This happens because in the code path executed by the inode_paths ioctl we end up nesting two calls to read lock a leaf's rwlock when after the first call to read_lock() and before the second call to read_lock(), another task (running the delayed items as part of a transaction commit) has already called write_lock() against the leaf's rwlock. This situation is illustrated by the following diagram: Task A Task B btrfs_ref_to_path() btrfs_commit_transaction() read_lock(&eb->lock); btrfs_run_delayed_items() __btrfs_commit_inode_delayed_items() __btrfs_update_delayed_inode() btrfs_lookup_inode() write_lock(&eb->lock); --> task waits for lock read_lock(&eb->lock); --> makes this task hang forever (and task B too of course) So fix this by avoiding doing the nested read lock, which is easily avoidable. This issue does not happen if task B calls write_lock() after task A does the second call to read_lock(), however there does not seem to exist anything in the documentation that mentions what is the expected behaviour for recursive locking of rwlocks (leaving the idea that doing so is not a good usage of rwlocks). Also, as a side effect necessary for this fix, make sure we do not needlessly read lock extent buffers when the input path has skip_locking set (used when called from send). Cc: stable@vger.kernel.org Signed-off-by: Filipe Manana Signed-off-by: Sasha Levin commit 422d2ab7e8cc67ed57d79a297f6832c74c0a0d80 Author: Nicholas Bellinger Date: Mon Jan 11 21:53:05 2016 -0800 target: Fix LUN_RESET active TMR descriptor handling [ Upstream commit a6d9bb1c9605cd4f44e2d8290dc4d0e88f20292d ] This patch fixes a NULL pointer se_cmd->cmd_kref < 0 refcount bug during TMR LUN_RESET with active TMRs, triggered during se_cmd + se_tmr_req descriptor shutdown + release via core_tmr_drain_tmr_list(). To address this bug, go ahead and obtain a local kref_get_unless_zero(&se_cmd->cmd_kref) for active I/O to set CMD_T_ABORTED, and transport_wait_for_tasks() followed by the final target_put_sess_cmd() to drop the local ->cmd_kref. Also add two new checks within target_tmr_work() to avoid CMD_T_ABORTED -> TFO->queue_tm_rsp() callbacks ahead of invoking the backend -> fabric put in transport_cmd_check_stop_to_fabric(). For good measure, also change core_tmr_release_req() to use list_del_init() ahead of se_tmr_req memory free. Reviewed-by: Quinn Tran Cc: Himanshu Madhani Cc: Sagi Grimberg Cc: Christoph Hellwig Cc: Hannes Reinecke Cc: Andy Grover Cc: Mike Christie Cc: stable@vger.kernel.org # 3.10+ Signed-off-by: Nicholas Bellinger Signed-off-by: Sasha Levin commit 6f04c5131ca6a50b8b7747eed79bf5efe9c51e8b Author: Bart Van Assche Date: Mon Apr 27 13:52:36 2015 +0200 target: Remove first argument of target_{get,put}_sess_cmd() [ Upstream commit afc16604c06414223478df3e42301ab630b9960a ] The first argument of these two functions is always identical to se_cmd->se_sess. Hence remove the first argument. Signed-off-by: Bart Van Assche Reviewed-by: Sagi Grimberg Reviewed-by: Christoph Hellwig Cc: Andy Grover Cc: Cc: Felipe Balbi Cc: Michael S. Tsirkin Signed-off-by: Nicholas Bellinger Signed-off-by: Sasha Levin commit 988ba6758fbc851c71505c3df18eb8af1e730fa7 Author: Vinod Koul Date: Mon Feb 1 22:26:40 2016 +0530 ASoC: dpcm: fix the BE state on hw_free [ Upstream commit 5e82d2be6ee53275c72e964507518d7964c82753 ] While performing hw_free, DPCM checks the BE state but leaves out the suspend state. The suspend state needs to be checked as well, as we might be suspended and then usermode closes rather than resuming the audio stream. This was found by a stress testing of system with playback in loop and killed after few seconds running in background and second script running suspend-resume test in loop Signed-off-by: Vinod Koul Acked-by: Liam Girdwood Signed-off-by: Mark Brown Cc: stable@vger.kernel.org Signed-off-by: Sasha Levin commit 992eab6ecc8255a005c6bfc9505eca9a8d08206a Author: zengtao Date: Tue Feb 2 11:38:34 2016 +0800 cputime: Prevent 32bit overflow in time[val|spec]_to_cputime() [ Upstream commit 0f26922fe5dc5724b1adbbd54b21bad03590b4f3 ] The datatype __kernel_time_t is u32 on 32bit platform, so its subject to overflows in the timeval/timespec to cputime conversion. Currently the following functions are affected: 1. setitimer() 2. timer_create/timer_settime() 3. sys_clock_nanosleep This can happen on MIPS32 and ARM32 with "Full dynticks CPU time accounting" enabled, which is required for CONFIG_NO_HZ_FULL. Enforce u64 conversion to prevent the overflow. Fixes: 31c1fc818715 ("ARM: Kconfig: allow full nohz CPU accounting") Signed-off-by: zengtao Reviewed-by: Arnd Bergmann Cc: Cc: stable@vger.kernel.org Link: http://lkml.kernel.org/r/1454384314-154784-1-git-send-email-prime.zeng@huawei.com Signed-off-by: Thomas Gleixner Signed-off-by: Sasha Levin commit 35eacd10a054fb9658f8c15edd5cac19a476db9d Author: James Hogan Date: Mon Jan 25 20:32:03 2016 +0000 MIPS: Fix buffer overflow in syscall_get_arguments() [ Upstream commit f4dce1ffd2e30fa31756876ef502ce6d2324be35 ] Since commit 4c21b8fd8f14 ("MIPS: seccomp: Handle indirect system calls (o32)"), syscall_get_arguments() attempts to handle o32 indirect syscall arguments by incrementing both the start argument number and the number of arguments to fetch. However only the start argument number needs to be incremented. The number of arguments does not change, they're just shifted up by one, and in fact the output array is provided by the caller and is likely only n entries long, so reading more arguments overflows the output buffer. In the case of seccomp, this results in it fetching 7 arguments starting at the 2nd one, which overflows the unsigned long args[6] in populate_seccomp_data(). This clobbers the $s0 register from syscall_trace_enter() which __seccomp_phase1_filter() saved onto the stack, into which syscall_trace_enter() had placed its syscall number argument. This caused Chromium to crash. Credit goes to Milko for tracking it down as far as $s0 being clobbered. Fixes: 4c21b8fd8f14 ("MIPS: seccomp: Handle indirect system calls (o32)") Reported-by: Milko Leporis Signed-off-by: James Hogan Cc: linux-mips@linux-mips.org Cc: # 3.15- Patchwork: https://patchwork.linux-mips.org/patch/12213/ Signed-off-by: Ralf Baechle Signed-off-by: Sasha Levin commit 3fa9638c67f6e7f3c4a242ecb21eaacda9dcff91 Author: Tejun Heo Date: Mon Feb 1 11:33:21 2016 -0500 libata: fix sff host state machine locking while polling [ Upstream commit 8eee1d3ed5b6fc8e14389567c9a6f53f82bb7224 ] The bulk of ATA host state machine is implemented by ata_sff_hsm_move(). The function is called from either the interrupt handler or, if polling, a work item. Unlike from the interrupt path, the polling path calls the function without holding the host lock and ata_sff_hsm_move() selectively grabs the lock. This is completely broken. If an IRQ triggers while polling is in progress, the two can easily race and end up accessing the hardware and updating state machine state at the same time. This can put the state machine in an illegal state and lead to a crash like the following. kernel BUG at drivers/ata/libata-sff.c:1302! invalid opcode: 0000 [#1] SMP DEBUG_PAGEALLOC KASAN Modules linked in: CPU: 1 PID: 10679 Comm: syz-executor Not tainted 4.5.0-rc1+ #300 Hardware name: QEMU Standard PC (i440FX + PIIX, 1996), BIOS Bochs 01/01/2011 task: ffff88002bd00000 ti: ffff88002e048000 task.ti: ffff88002e048000 RIP: 0010:[] [] ata_sff_hsm_move+0x619/0x1c60 ... Call Trace: [] __ata_sff_port_intr+0x1e1/0x3a0 drivers/ata/libata-sff.c:1584 [] ata_bmdma_port_intr+0x71/0x400 drivers/ata/libata-sff.c:2877 [< inline >] __ata_sff_interrupt drivers/ata/libata-sff.c:1629 [] ata_bmdma_interrupt+0x253/0x580 drivers/ata/libata-sff.c:2902 [] handle_irq_event_percpu+0x108/0x7e0 kernel/irq/handle.c:157 [] handle_irq_event+0xa7/0x140 kernel/irq/handle.c:205 [] handle_edge_irq+0x1e3/0x8d0 kernel/irq/chip.c:623 [< inline >] generic_handle_irq_desc include/linux/irqdesc.h:146 [] handle_irq+0x10c/0x2a0 arch/x86/kernel/irq_64.c:78 [] do_IRQ+0x7d/0x1a0 arch/x86/kernel/irq.c:240 [] common_interrupt+0x8c/0x8c arch/x86/entry/entry_64.S:520 [< inline >] rcu_lock_acquire include/linux/rcupdate.h:490 [< inline >] rcu_read_lock include/linux/rcupdate.h:874 [] filemap_map_pages+0x131/0xba0 mm/filemap.c:2145 [< inline >] do_fault_around mm/memory.c:2943 [< inline >] do_read_fault mm/memory.c:2962 [< inline >] do_fault mm/memory.c:3133 [< inline >] handle_pte_fault mm/memory.c:3308 [< inline >] __handle_mm_fault mm/memory.c:3418 [] handle_mm_fault+0x2516/0x49a0 mm/memory.c:3447 [] __do_page_fault+0x376/0x960 arch/x86/mm/fault.c:1238 [] trace_do_page_fault+0xe8/0x420 arch/x86/mm/fault.c:1331 [] do_async_page_fault+0x14/0xd0 arch/x86/kernel/kvm.c:264 [] async_page_fault+0x28/0x30 arch/x86/entry/entry_64.S:986 Fix it by ensuring that the polling path is holding the host lock before entering ata_sff_hsm_move() so that all hardware accesses and state updates are performed under the host lock. Signed-off-by: Tejun Heo Reported-and-tested-by: Dmitry Vyukov Link: http://lkml.kernel.org/g/CACT4Y+b_JsOxJu2EZyEf+mOXORc_zid5V1-pLZSroJVxyWdSpw@mail.gmail.com Cc: stable@vger.kernel.org Signed-off-by: Sasha Levin commit abd62631a46c8a27fc607ea5525d3034e8d69781 Author: Dan Carpenter Date: Tue Jan 26 12:24:25 2016 +0300 intel_scu_ipcutil: underflow in scu_reg_access() [ Upstream commit b1d353ad3d5835b16724653b33c05124e1b5acf1 ] "count" is controlled by the user and it can be negative. Let's prevent that by making it unsigned. You have to have CAP_SYS_RAWIO to call this function so the bug is not as serious as it could be. Fixes: 5369c02d951a ('intel_scu_ipc: Utility driver for intel scu ipc') Signed-off-by: Dan Carpenter Cc: stable@vger.kernel.org Signed-off-by: Darren Hart Signed-off-by: Sasha Levin commit ce4f0721444eea05c31c072981a165750b8b243b Author: Alexei Potashnik Date: Tue Jul 14 16:00:49 2015 -0400 qla2xxx: terminate exchange when command is aborted by LIO [ Upstream commit 7359df25a53386dd33c223672bbd12cb49d0ce4f ] The newly introduced aborted_task TFO callback has to terminate exchange with QLogic driver, since command is being deleted and no status will be queued to the driver at a later point. This patch also moves the burden of releasing one cmd refcount to the aborted_task handler. Changed iSCSI aborted_task logic to satisfy the above requirement. Cc: # v3.18+ Signed-off-by: Alexei Potashnik Acked-by: Quinn Tran Signed-off-by: Himanshu Madhani Signed-off-by: Nicholas Bellinger Signed-off-by: Sasha Levin commit eaaecc5e28b49dc00c4c36f1f5fd1acf1f4f4be1 Author: Alexei Potashnik Date: Tue Jul 14 16:00:46 2015 -0400 qla2xxx: added sess generations to detect RSCN update races [ Upstream commit df673274fa4896f25f0bf348d2a3535d74b4cbec ] RSCN processing in qla2xxx driver can run in parallel with ELS/IO processing. As such the decision to remove disappeared fc port's session could be stale, because a new login sequence has occurred since and created a brand new session. Previous mechanism of dealing with this by delaying deletion request was prone to erroneous deletions if the event that was supposed to cancel the deletion never arrived or has been delayed in processing. New mechanism relies on a time-like generation counter to serialize RSCN updates relative to ELS/IO updates. Cc: # v3.18+ Signed-off-by: Alexei Potashnik Signed-off-by: Himanshu Madhani Signed-off-by: Nicholas Bellinger Signed-off-by: Sasha Levin commit 1226aefc12eef8b339bdaff0bfe11d72eb4978df Author: Alexei Potashnik Date: Tue Jul 14 16:00:45 2015 -0400 qla2xxx: Abort stale cmds on qla_tgt_wq when plogi arrives [ Upstream commit daddf5cf9b5c68b81b2bb7133f1dd0fda4552d0b ] cancel any commands from initiator's s_id that are still waiting on qla_tgt_wq when PLOGI arrives. Cc: # v3.18+ Signed-off-by: Alexei Potashnik Acked-by: Quinn Tran Signed-off-by: Himanshu Madhani Signed-off-by: Nicholas Bellinger Signed-off-by: Sasha Levin commit 772d56fecb6dde547473cf7b556f457555897da7 Author: Alexei Potashnik Date: Tue Jul 14 16:00:48 2015 -0400 qla2xxx: drop cmds/tmrs arrived while session is being deleted [ Upstream commit e52a8b45b9c782937f74b701f8c656d4e5619eb5 ] If a new initiator (different WWN) shows up on the same fcport, old initiator's session is scheduled for deletion. But there is a small window between it being marked with QLA_SESS_DELETION_IN_PROGRESS and qlt_unret_sess getting called when new session's commands will keep finding old session in the fcport map. This patch drops cmds/tmrs if they find session in the progress of being deleted. Cc: # v3.18+ Signed-off-by: Alexei Potashnik Acked-by: Quinn Tran Signed-off-by: Himanshu Madhani Signed-off-by: Nicholas Bellinger Signed-off-by: Sasha Levin commit f89ace487f87aa90b7426ec94872cc38ee11fd1e Author: Alexei Potashnik Date: Tue Jul 14 16:00:44 2015 -0400 qla2xxx: delay plogi/prli ack until existing sessions are deleted [ Upstream commit a6ca88780dd66b0700d89419abd17b6b4bb49483 ] - keep qla_tgt_sess object on the session list until it's freed - modify use of sess->deleted flag to differentiate delayed session deletion that can be cancelled from irreversible one: QLA_SESS_DELETION_PENDING vs QLA_SESS_DELETION_IN_PROGRESS - during IN_PROGRESS deletion all newly arrived commands and TMRs will be rejected, existing commands and TMRs will be terminated when given by the core to the fabric or simply dropped if session logout has already happened (logout terminates all existing exchanges) - new PLOGI will initiate deletion of the following sessions (unless deletion is already IN_PROGRESS): - with the same port_name (with logout) - different port_name, different loop_id but the same port_id (with logout) - different port_name, different port_id, but the same loop_id (without logout) - additionally each new PLOGI will store imm notify iocb in the same port_name session being deleted. When deletion process completes this iocb will be acked. Only the most recent PLOGI iocb is stored. The older ones will be terminated when replaced. - new PRLI will initiate deletion of the following sessions (unless deletion is already IN_PROGRESS): - different port_name, different port_id, but the same loop_id (without logout) Cc: # v3.18+ Signed-off-by: Alexei Potashnik Acked-by: Quinn Tran Signed-off-by: Himanshu Madhani Signed-off-by: Nicholas Bellinger Signed-off-by: Sasha Levin commit ea116368a3680fc4ea10196736adae2109f96301 Author: Swapnil Nagle Date: Tue Jul 14 16:00:43 2015 -0400 qla2xxx: cleanup cmd in qla workqueue before processing TMR [ Upstream commit 8b2f5ff3d05c2c48b722c3cc67b8226f1601042b ] Since cmds go into qla_tgt_wq and TMRs don't, it's possible that TMR like TASK_ABORT can be queued over the cmd for which it was meant. To avoid this race, use a per-port list to keep track of cmds that are enqueued to qla_tgt_wq but not yet processed. When a TMR arrives, iterate through this list and remove any cmds that match the TMR. This patch supports TASK_ABORT and LUN_RESET. Cc: # v3.18+ Signed-off-by: Swapnil Nagle Signed-off-by: Alexei Potashnik Acked-by: Quinn Tran Signed-off-by: Himanshu Madhani Signed-off-by: Nicholas Bellinger Signed-off-by: Sasha Levin commit f5fa85b4e599a45e6d056d44eb920eb42ed63849 Author: Dmitry Torokhov Date: Sat Jan 16 10:04:49 2016 -0800 Input: vmmouse - fix absolute device registration [ Upstream commit d4f1b06d685d11ebdaccf11c0db1cb3c78736862 ] We should set device's capabilities first, and then register it, otherwise various handlers already present in the kernel will not be able to connect to the device. Reported-by: Lauri Kasanen Cc: stable@vger.kernel.org Signed-off-by: Dmitry Torokhov Signed-off-by: Sasha Levin commit c77c01982668678a5fafe3036be4e757b7eddbaf Author: Tejun Heo Date: Fri Jan 15 15:13:05 2016 -0500 libata: disable forced PORTS_IMPL for >= AHCI 1.3 [ Upstream commit 566d1827df2ef0cbe921d3d6946ac3007b1a6938 ] Some early controllers incorrectly reported zero ports in PORTS_IMPL register and the ahci driver fabricates PORTS_IMPL from the number of ports in those cases. This hasn't mattered but with the new nvme controllers there are cases where zero PORTS_IMPL is valid and should be honored. Disable the workaround for >= AHCI 1.3. Signed-off-by: Tejun Heo Reported-by: Andy Lutomirski Link: http://lkml.kernel.org/g/CALCETrU7yMvXEDhjAUShoHEhDwifJGapdw--BKxsP0jmjKGmRw@mail.gmail.com Cc: Sergei Shtylyov Cc: stable@vger.kernel.org Signed-off-by: Sasha Levin commit 1e7516689817b56f78dc60497a4dd7c80319417e Author: Sebastian Andrzej Siewior Date: Mon Jan 25 10:08:00 2016 -0600 PCI/AER: Flush workqueue on device remove to avoid use-after-free [ Upstream commit 4ae2182b1e3407de369f8c5d799543b7db74221b ] A Root Port's AER structure (rpc) contains a queue of events. aer_irq() enqueues AER status information and schedules aer_isr() to dequeue and process it. When we remove a device, aer_remove() waits for the queue to be empty, then frees the rpc struct. But aer_isr() references the rpc struct after dequeueing and possibly emptying the queue, which can cause a use-after-free error as in the following scenario with two threads, aer_isr() on the left and a concurrent aer_remove() on the right: Thread A Thread B -------- -------- aer_irq(): rpc->prod_idx++ aer_remove(): wait_event(rpc->prod_idx == rpc->cons_idx) # now blocked until queue becomes empty aer_isr(): # ... rpc->cons_idx++ # unblocked because queue is now empty ... kfree(rpc) mutex_unlock(&rpc->rpc_mutex) To prevent this problem, use flush_work() to wait until the last scheduled instance of aer_isr() has completed before freeing the rpc struct in aer_remove(). I reproduced this use-after-free by flashing a device FPGA and re-enumerating the bus to find the new device. With SLUB debug, this crashes with 0x6b bytes (POISON_FREE, the use-after-free magic number) in GPR25: pcieport 0000:00:00.0: AER: Multiple Corrected error received: id=0000 Unable to handle kernel paging request for data at address 0x27ef9e3e Workqueue: events aer_isr GPR24: dd6aa000 6b6b6b6b 605f8378 605f8360 d99b12c0 604fc674 606b1704 d99b12c0 NIP [602f5328] pci_walk_bus+0xd4/0x104 [bhelgaas: changelog, stable tag] Signed-off-by: Sebastian Andrzej Siewior Signed-off-by: Bjorn Helgaas CC: stable@vger.kernel.org Signed-off-by: Sasha Levin commit 2b65014b4a1083a8715f3425b292723d9729e3d0 Author: Tejun Heo Date: Thu Jan 21 15:31:11 2016 -0500 cgroup: make sure a parent css isn't offlined before its children [ Upstream commit aa226ff4a1ce79f229c6b7a4c0a14e17fececd01 ] There are three subsystem callbacks in css shutdown path - css_offline(), css_released() and css_free(). Except for css_released(), cgroup core didn't guarantee the order of invocation. css_offline() or css_free() could be called on a parent css before its children. This behavior is unexpected and led to bugs in cpu and memory controller. This patch updates offline path so that a parent css is never offlined before its children. Each css keeps online_cnt which reaches zero iff itself and all its children are offline and offline_css() is invoked only after online_cnt reaches zero. This fixes the memory controller bug and allows the fix for cpu controller. Signed-off-by: Tejun Heo Reported-and-tested-by: Christian Borntraeger Reported-by: Brian Christiansen Link: http://lkml.kernel.org/g/5698A023.9070703@de.ibm.com Link: http://lkml.kernel.org/g/CAKB58ikDkzc8REt31WBkD99+hxNzjK4+FBmhkgS+NVrC9vjMSg@mail.gmail.com Cc: Heiko Carstens Cc: Peter Zijlstra Cc: stable@vger.kernel.org Signed-off-by: Sasha Levin commit 13a16b9e3907607e4a324373362701c26f6e1cea Author: Tejun Heo Date: Wed May 13 15:38:40 2015 -0400 cgroup: separate out include/linux/cgroup-defs.h [ Upstream commit b4a04ab7a37b490cad48e69abfe14288cacb669c ] From 2d728f74bfc071df06773e2fd7577dd5dab6425d Mon Sep 17 00:00:00 2001 From: Tejun Heo Date: Wed, 13 May 2015 15:37:01 -0400 This patch separates out cgroup-defs.h from cgroup.h which has grown a lot of dependencies. cgroup-defs.h currently only contains constant and type definitions and can be used to break circular include dependency. While moving, definitions are reordered so that cgroup-defs.h has consistent logical structure. This patch is pure reorganization. Signed-off-by: Tejun Heo Signed-off-by: Sasha Levin commit 69cc9251b85d6ddfd8e94b6be60aacdceec98b22 Author: Bard Liao Date: Thu Jan 21 13:13:40 2016 +0800 ASoC: rt5645: fix the shift bit of IN1 boost [ Upstream commit b28785fa9cede0d4f47310ca0dd2a4e1d50478b5 ] The shift bit of IN1 boost gain control is 12. Signed-off-by: Bard Liao Signed-off-by: Mark Brown Cc: stable@vger.kernel.org Signed-off-by: Sasha Levin commit 789a2e5a69bc3dd5197b8f51451359d06c4ac8a1 Author: CQ Tang Date: Wed Jan 13 21:15:03 2016 +0000 iommu/vt-d: Fix 64-bit accesses to 32-bit DMAR_GSTS_REG [ Upstream commit fda3bec12d0979aae3f02ee645913d66fbc8a26e ] This is a 32-bit register. Apparently harmless on real hardware, but causing justified warnings in simulation. Signed-off-by: CQ Tang Signed-off-by: David Woodhouse Cc: stable@vger.kernel.org Signed-off-by: Sasha Levin