Open Bug 526587 (PoisonFrameCrash) Opened 15 years ago Updated 2 years ago

[meta] Crashes at the Poison Frame address

Categories

(Core :: Layout, defect)

defect

Tracking

()

People

(Reporter: chofmann, Unassigned)

References

(Depends on 2 open bugs)

Details

(4 keywords, Whiteboard: [sg:nse meta][crashkill])

Attachments

(11 files)

looks like frame poisoning is exposing crash bugs in a variety of areas.

here is a quick list to start looking at 
 
 29 nsAccessibilityService::GetAccessible(nsIDOMNode*, nsIPresShell*, nsIWeakReference*, nsIFrame**, int*, nsIAccessible**) Windows NT 3.6b1 0xfffffffff0dea813

tracked in Bug 525579 


  11 nsAccessibleTreeWalker::PopState() Windows NT 3.6b1 0xfffffffff0dea81b
   4 nsAccessibilityService::GetAccessible(nsIDOMNode*, nsIPresShell*, nsIWeakReference*, nsIFrame**, int*, nsIAccessible**) Windows NT 3.6b2pre 0xfffffffff0dea813
   3 nsStyleContext::GetStyleDisplay() Windows NT 3.6b1 0xfffffffff0dea81f
   2 nsFrameManager::RemoveFrame(nsIFrame*, nsIAtom*, nsIFrame*) Windows NT 3.7a1pre 0xfffffffff0dea7ff
   2 nsCSSFrameConstructor::MaybeRecreateContainerForFrameRemoval(nsIFrame*, unsigned int*) Windows NT 3.7a1pre 0xfffffffff0dea91b
   2 nsCOMPtr_base::assign_from_qi(nsQueryInterface, nsID const&) Windows NT 3.6b1 0xfffffffff0dea7ff
   2 nsAccessibilityService::GetAccessible(nsIDOMNode*, nsIPresShell*, nsIWeakReference*, nsIFrame**, int*, nsIAccessible**) Windows NT 3.7a1pre 0xfffffffff0dea813
   1 xul.dll@0x42d9d7 Windows NT 3.6b1 0xfffffffff0dea807
   1 nsObjectLoadingContent::Instantiate(nsIObjectFrame*, nsACString_internal const&, nsIURI*) Windows NT 3.6b1 0xfffffffff0dea803
   1 nsIFrame::SetNextSibling(nsIFrame*) Windows NT 3.7a1pre 0xfffffffff0dea823
   1 nsIFrame::InvalidateInternal(nsRect const&, int, int, nsIFrame*, unsigned int) Windows NT 3.7a1pre 0xfffffffff0dea933
   1 nsIFrame::GetStyleDisplay() Windows NT 3.6b1 0xfffffffff0dea817
   1 nsIFrame::GetOffsetTo(nsIFrame const*) Windows NT 3.6b1 0xfffffffff0dea803
   1 do_QueryFrame::operator<nsIScrollableFrame> nsIScrollableFrame*() Windows NT 3.7a1pre 0xfffffffff0dea7ff
   1 PresShell::CreateRenderingContext(nsIFrame*, nsIRenderingContext**) Windows NT 3.7a1pre 0xfffffffff0dea817


talking with dveditz we might want to keep this bug close because these might be exploitable crashes in older releases that don't have frame poisoning.
Group: core-security
Depends on: 25579
Whiteboard: [crashkill]

here is query for #4  nsStyleContext::GetStyleDisplay() on the list above.  that stack signature and this query contains a combination of older crashes from past releases, and new ones related to frame poisoning.

http://crash-stats.mozilla.com/report/list?product=Firefox&version=Firefox%3A3.6b1&query_search=signature&query_type=exact&query=nsStyleContext%3A%3AGetStyleDisplay%28%29&date=&range_value=1&range_unit=weeks&do_query=1&signature=nsStyleContext%3A%3AGetStyleDisplay%28%29
re comment 1:

spun off to  Bug 526592 

all the bugs on the dependency tree of this bug will show up as crash regressions in 3.6 when it ships so we might want to take and bake as many of these fixes as we can.
Depends on: 525579, 526592
No longer depends on: 25579
Flags: wanted1.9.2?
Keywords: regression
Keywords: meta
as the number of users ramps on the beta it might be likely that more code
paths get exposed to this problem so we should keep an eye out for additions to
the list in comment 0

bclary, here is what I used to get that list started.

% grep 0xfffffffff0de  20091103-crashdata.csv | awk -F\t '{print
$1,$11,$8,$14}' | sort | uniq -c | sort -nr | more

we might also need to refine the grep search term to catch more frame poisoned
crash addresses
This doesn't exactly surprise me; the whole point of frame poisoning was, we knew there were a ton of use-after-free bugs in the layout engine, and we wanted to make sure that none of them were exploitable.  But some of them were silent, because the code using the freed data was finding something not entirely unlike what it would have found if the data hadn't been free yet, and carrying on.  Now most of 'em are going to be prompt crashers.

As I said in bug 525579, the fix is, in general, to find the code that freed an object that still had outstanding references and make it not do that, or else to make those references go away.

I'd suggest a slight tweak to the grep pattern right now:

% egrep '0x7?f+f0de'

because x86-64 builds use a leading 7 to put the poison address in a region that's invalid at the hardware level (so we don't have to worry about what the OS might do).  Shortly there is going to be code for all 32-bit Mac builds and 32-on-64 Linux and Windows that reserves the poison region at startup (there is no guaranteed-reserved address region on those OSes), though, and then there will not be a simple grep pattern anymore.
ok, here is an update list using all the crash data from oct/nov and the new grep patern

egrep '0x7?f+f0de'  200910* 200911* | cut -c24- |  awk -F\t '{print $14,$11,$8,"\t",$1}' | sort | uniq -c | sort -nr | awk '{printf "%s. %s\n",NR,$0}'

rank #crashes  address platform release  signature

1.  240 0xfffffffff0dea813 Windows NT 3.6b1 	 nsAccessibilityService::GetAccessible(nsIDOMNode*, nsIPresShell*, nsIWeakReference*, nsIFrame**, int*, nsIAccessible**)
2.   59 0xfffffffff0dea81b Windows NT 3.6b1 	 nsAccessibleTreeWalker::PopState()
3.   55 0xfffffffff0dea813 Windows NT 3.7a1pre 	 nsAccessibilityService::GetAccessible(nsIDOMNode*, nsIPresShell*, nsIWeakReference*, nsIFrame**, int*, nsIAccessible**)
4.   32 0xfffffffff0dea7ff Windows NT 3.7a1pre 	 do_QueryFrame::operator<nsIScrollableFrame> nsIScrollableFrame*()
5.   30 0xfffffffff0dea81f Windows NT 3.6b1 	 nsStyleContext::GetStyleDisplay()
6.   21 0xfffffffff0dea817 Windows NT 3.7a1pre 	 nsIFrame::GetClosestView(nsPoint*)
7.   19 0xfffffffff0dea91b Windows NT 3.7a1pre 	 nsCSSFrameConstructor::MaybeRecreateContainerForFrameRemoval(nsIFrame*, unsigned int*)
8.   17 0xfffffffff0dea933 Windows NT 3.7a1pre 	 nsIFrame::InvalidateInternal(nsRect const&, int, int, nsIFrame*, unsigned int)
9.   15 0xfffffffff0dea813 Windows NT 3.6b2pre 	 nsAccessibilityService::GetAccessible(nsIDOMNode*, nsIPresShell*, nsIWeakReference*, nsIFrame**, int*, nsIAccessible**)
10.   14 0xfffffffff0dea823 Windows NT 3.7a1pre 	 nsIFrame::SetNextSibling(nsIFrame*)
11.   10 0xfffffffff0dea81b Windows NT 3.7a1pre 	 nsAccessibleTreeWalker::PopState()
12.   10 0xfffffffff0dea813 Windows NT 3.6b1pre 	 nsAccessibilityService::GetAccessible(nsIDOMNode*, nsIPresShell*, nsIWeakReference*, nsIFrame**, int*, nsIAccessible**)
13.   10 0xfffffffff0dea7ff Windows NT 3.6b1 	 nsCOMPtr_base::assign_from_qi(nsQueryInterface, nsID const&)
14.    9 0xfffffffff0dea803 Windows NT 3.7a1pre 	 nsObjectLoadingContent::Instantiate(nsIObjectFrame*, nsACString_internal const&, nsIURI*)
15.    8 0xfffffffff0dea80f Windows NT 3.7a1pre 	 nsINode::HasSlots()
16.    7 0xfffffffff0dea80f Windows NT 3.7a1pre 	 nsCSSFrameConstructor::FindFrameWithContent(nsFrameManager*, nsIFrame*, nsIContent*, nsIContent*, nsFindFrameHint*)
17.    6 0xfffffffff0dea7ff Windows NT 3.6b1 	 nsIContent::SetText(nsAString_internal const&, int)
18.    5 0xfffffffff0dea803 Windows NT 3.6b1 	 nsObjectLoadingContent::Instantiate(nsIObjectFrame*, nsACString_internal const&, nsIURI*)
19.    4 0xfffffffff0dea81f Windows NT 3.7a1pre 	 nsStyleContext::GetStyleDisplay()
20.    4 0xfffffffff0dea817 Windows NT 3.6b1 	 nsDisplayText::Paint(nsDisplayListBuilder*, nsIRenderingContext*)
21.    3 0xfffffffff0def0e1 Windows NT 3.0.14 	 @0xf0def0e1
22.    3 0xfffffffff0dea823 Mac OS X 3.7a1pre 	 nsFrameList::InsertFrames(nsIFrame*, nsIFrame*, nsFrameList&)
23.    3 0xfffffffff0dea817 Windows NT 3.6b1 	 nsIFrame::GetStyleDisplay()
24.    3 0xfffffffff0dea7ff Windows NT 3.7a1pre 	 nsFrameManager::RemoveFrame(nsIFrame*, nsIAtom*, nsIFrame*)
25.    3 0xfffffffff0dea7ff Windows NT 3.7a1pre 	 nsCOMPtr_base::assign_with_AddRef(nsISupports*)
26.    2 0xfffffffff0dea91b Windows NT 3.7a1pre 	 nsBidiPresUtils::Resolve(nsBlockFrame*, int)
27.    2 0xfffffffff0dea893 Windows NT 3.7a1pre 	 nsLayoutUtils::GetNextContinuationOrSpecialSibling(nsIFrame*)
28.    2 0xfffffffff0dea81f Windows NT 3.7a1pre 	 nsCachedStyleData::GetStyleTextReset()
29.    2 0xfffffffff0dea81f Windows NT 3.6b2pre 	 nsAccessibleTreeWalker::UpdateFrame(int)
30.    2 0xfffffffff0dea81f Windows NT 3.6b1 	 @0x0
31.    2 0xfffffffff0dea81b Windows NT 3.6b2pre 	 nsAccessibleTreeWalker::PopState()
32.    2 0xfffffffff0dea817 Windows NT 3.7a1pre 	 PresShell::CreateRenderingContext(nsIFrame*, nsIRenderingContext**)
33.    2 0xfffffffff0dea80b Windows NT 3.6b1 	 nsStyleContext::GetStyleDisplay()
34.    2 0xfffffffff0dea807 Windows NT 3.7a1pre 	 nsINode::GetOwnerDoc()
35.    2 0xfffffffff0dea803 Windows NT 3.6b1 	 nsIFrame::GetOffsetTo(nsIFrame const*)
36.    2 0xfffffffff0dea803 Windows NT 3.6b1 	 nsDisplayBackground::Paint(nsDisplayListBuilder*, nsIRenderingContext*)
37.    2 0xfffffffff0dea7ff Windows NT 3.7a1pre 	 do_QueryFrame::operator<nsBlockFrame> nsBlockFrame*()
38.    2 0xfffffffff0dea7ff Windows NT 3.6b2pre 	 nsCOMPtr_base::assign_from_qi(nsQueryInterface, nsID const&)
39.    2 0xfffffffff0dea7ff Windows NT 3.6b1 	 nsCOMPtr_base::assign_with_AddRef(nsISupports*)
40.    2 0xfffffffff0dea7ff Windows NT 3.6b1 	 nsAsyncInstantiateEvent::Run()
41.    2 0xfffffffff0dea7ff Mac OS X 3.7a1pre 	 nsFrame::HandleDrag(nsPresContext*, nsGUIEvent*, nsEventStatus*)
42.    1 0xfffffffffff0decc Windows NT 3.5.3 	 NPSWF32.dll@0xe6bb
43.    1 0xffffffffff0de04c Windows NT 3.5.3 	 nsINode::GetOwnerDoc()
44.    1 0xfffffffff0def0e1 Windows NT 3.0.14 	 nsCOMPtr_base::~nsCOMPtr_base()
45.    1 0xfffffffff0dea91b Windows NT 3.7a1pre 	 nsPlaceholderFrame::GetRealFrameFor(nsIFrame*)
46.    1 0xfffffffff0dea857 Windows NT 3.7a1pre 	 xul.dll@0x308d86
47.    1 0xfffffffff0dea857 Windows NT 3.7a1pre 	 nsIFrame::GetFirstChild(nsIAtom*)
48.    1 0xfffffffff0dea857 Windows NT 3.7a1pre 	 nsCSSFrameConstructor::FindFrameWithContent(nsFrameManager*, nsIFrame*, nsIContent*, nsIContent*, nsFindFrameHint*)
49.    1 0xfffffffff0dea81f Windows NT 3.7a1pre 	 nsCachedStyleData::GetStyleUIReset()
50.    1 0xfffffffff0dea81f Windows NT 3.7a1pre 	 nsAccessibleTreeWalker::UpdateFrame(int)
51.    1 0xfffffffff0dea81f Windows NT 3.6b1 	 nsAccessibleTreeWalker::UpdateFrame(int)
52.    1 0xfffffffff0dea81f Windows NT 3.6b1 	 PresShell::Paint(nsIView*, nsIRenderingContext*, nsRegion const&)
53.    1 0xfffffffff0dea81f Windows NT 3.6b1 	 @0x1
54.    1 0xfffffffff0dea817 Windows NT 3.7a1pre 	 nsWeakFrame::Init(nsIFrame*)
55.    1 0xfffffffff0dea817 Windows NT 3.6b1 	 xul.dll@0x37f9c0
56.    1 0xfffffffff0dea817 Mac OS X 3.7a1pre 	 nsFrame::Destroy()
57.    1 0xfffffffff0dea813 Windows NT 3.7a1pre 	 xul.dll@0x818a7a
58.    1 0xfffffffff0dea813 Windows NT 3.7a1pre 	 nsCSSFrameConstructor::MaybeRecreateContainerForFrameRemoval(nsIFrame*, unsigned int*)
59.    1 0xfffffffff0dea813 Windows NT 3.6b1 	 @0x0
60.    1 0xfffffffff0dea80f Mac OS X 3.7a1pre 	 nsCSSFrameConstructor::FindFrameWithContent(nsFrameManager*, nsIFrame*, nsIContent*, nsIContent*, nsFindFrameHint*)
61.    1 0xfffffffff0dea80b Windows NT 3.6b1 	 nsRuleNode::DestroyInternal(nsRuleNode***)
62.    1 0xfffffffff0dea807 Windows NT 3.6b1 	 xul.dll@0x42d9d7
63.    1 0xfffffffff0dea803 Windows NT 3.7a1pre 	 nsIFrame::GetOffsetTo(nsIFrame const*)
64.    1 0xfffffffff0dea803 Windows NT 3.6b1 	 nsIFrame::BuildDisplayListForChild(nsDisplayListBuilder*, nsIFrame*, nsRect const&, nsDisplayListSet const&, unsigned int)
65.    1 0xfffffffff0dea7ff Windows NT 3.7a1pre 	 xul.dll@0x2f4054
66.    1 0xfffffffff0dea7ff Windows NT 3.6b1pre 	 nsCOMPtr_base::assign_with_AddRef(nsISupports*)
67.    1 0xfffffffff0dea7ff Windows NT 3.6b1 	 nsIFrame::GetView()
68.    1 0xfffffffff0dea7ff Windows NT 3.6b1 	 PresShell::FreeMisc(unsigned int, void*)
69.    1 0xfffffffff0dea7ff Windows NT 3.6b1 	 HashString(nsAString_internal const&)
70.    1 0xfffffffff0dea7ff Mac OS X 3.7a1pre 	 nsFrameManager::RemoveFrame(nsIFrame*, nsIAtom*, nsIFrame*)
71.    1 0xfffffffff0de7820 Windows NT 3.5.3 	 NPSWF32.dll@0x1ddb43
72.    1 0xfffffffff0de6e44 Windows NT 3.0.14 	 hash_access
Depends on: 526767
Depends on: 526769
Depends on: 526774
Depends on: 526853
re: comment #6 will start the Testrun for 1.9.2 today !
Whiteboard: [crashkill] → [sg:nse meta][crashkill]
(In reply to comment #4)
> This doesn't exactly surprise me; the whole point of frame poisoning was

Nor us, and it's great for upcoming releases. But we do need to look for these because 1) they might become topcrashers in 1.9.2 with poisoning, and 2) these become a red-flag for anyone who knows we've done poisoning and wants to hack the millions of people who haven't yet upgraded to releases we haven't shipped yet :-)
I just realized that '0x7?f+f0de' is wrong, it won't match '0xf0de....' which is what you get with 32-bit pointers.  '0x7?f+0de' would be better.

> ... we do need to look for these
> because 1) they might become topcrashers in 1.9.2 with poisoning, and 2) these
> become a red-flag for anyone who knows we've done poisoning and wants to hack
> the millions of people who haven't yet upgraded to releases we haven't shipped
> yet :-)

Oh, entirely agreed.

For the record, I think it *would* be possible to backport the QueryFrame stuff + the poisoning stuff to a 3.5.x security release without breaking anything; but it would be a large, tedious project that I personally do not want to do.  Hopefully 3.6 uptake is quick enough that we don't need it.
it doesn't look like the change between '0x7?f+f0de' and '0x7?f+f0de' picked  any additional signatures.  still at 59 across any crash in 3.6x or 3.7x.  here is the consolidated list.  we have bugs on file and hooked in to dependency for the top 6.  tomcats testing should be running and give us results if any of these have click-to-crash urls.  help is still need to file more bugs and develop crash test cases for anything on the dependency list.

chofmann$ egrep '0x7?f+0de'  200910* 200911* | cut -c24- |  awk -F\t '$8 ~ /3.[67]/{print $14,$11,"\t",$1}' | sort | uniq -c | sort -nr | awk '{printf "%s. %s\n",NR,$0}' 


1.  519 0xfffffffff0dea813 Windows NT 	 nsAccessibilityService::GetAccessible(nsIDOMNode*, nsIPresShell*, nsIWeakReference*, nsIFrame**, int*, nsIAccessible**)

bug 525579

2.  107 0xfffffffff0dea81b Windows NT 	 nsAccessibleTreeWalker::PopState()

bug 526767

3.   54 0xfffffffff0dea81f Windows NT 	 nsStyleContext::GetStyleDisplay()

bug 526592


4.   36 0xfffffffff0dea7ff Windows NT 	 do_QueryFrame::operator<nsIScrollableFrame> nsIScrollableFrame*()

bug 526769


5.   21 0xfffffffff0dea91b Windows NT 	 nsCSSFrameConstructor::MaybeRecreateContainerForFrameRemoval(nsIFrame*, unsigned int*)

bug 526853


6.   21 0xfffffffff0dea817 Windows NT 	 nsIFrame::GetClosestView(nsPoint*)

bug 526774

7.   20 0xfffffffff0dea933 Windows NT 	 nsIFrame::InvalidateInternal(nsRect const&, int, int, nsIFrame*, unsigned int)
8.   17 0xfffffffff0dea823 Windows NT 	 nsIFrame::SetNextSibling(nsIFrame*)
9.   15 0xfffffffff0dea803 Windows NT 	 nsObjectLoadingContent::Instantiate(nsIObjectFrame*, nsACString_internal const&, nsIURI*)
10.   15 0xfffffffff0dea7ff Windows NT 	 nsCOMPtr_base::assign_from_qi(nsQueryInterface, nsID const&)
11.    9 0xfffffffff0dea7ff Windows NT 	 nsCOMPtr_base::assign_with_AddRef(nsISupports*)
12.    8 0xfffffffff0dea80f Windows NT 	 nsINode::HasSlots()
13.    8 0xfffffffff0dea7ff Windows NT 	 nsIContent::SetText(nsAString_internal const&, int)
14.    7 0xfffffffff0dea80f Windows NT 	 nsCSSFrameConstructor::FindFrameWithContent(nsFrameManager*, nsIFrame*, nsIContent*, nsIContent*, nsFindFrameHint*)
15.    6 0xfffffffff0dea81f Windows NT 	 nsAccessibleTreeWalker::UpdateFrame(int)
16.    6 0xfffffffff0dea807 Windows NT 	 nsINode::GetOwnerDoc()
17.    6 0xfffffffff0dea803 Windows NT 	 nsIFrame::GetOffsetTo(nsIFrame const*)
18.    4 0xfffffffff0dea817 Windows NT 	 nsIFrame::GetStyleDisplay()
19.    4 0xfffffffff0dea817 Windows NT 	 nsDisplayText::Paint(nsDisplayListBuilder*, nsIRenderingContext*)
20.    3 0xfffffffff0dea823 Mac OS X 	 nsFrameList::InsertFrames(nsIFrame*, nsIFrame*, nsFrameList&)
21.    3 0xfffffffff0dea81f Windows NT 	 PresShell::Paint(nsIView*, nsIRenderingContext*, nsRegion const&)
22.    3 0xfffffffff0dea7ff Windows NT 	 nsFrameManager::RemoveFrame(nsIFrame*, nsIAtom*, nsIFrame*)
23.    2 0xfffffffff0dea91b Windows NT 	 nsBidiPresUtils::Resolve(nsBlockFrame*, int)
24.    2 0xfffffffff0dea893 Windows NT 	 nsLayoutUtils::GetNextContinuationOrSpecialSibling(nsIFrame*)
25.    2 0xfffffffff0dea81f Windows NT 	 nsCachedStyleData::GetStyleTextReset()
26.    2 0xfffffffff0dea81f Windows NT 	 @0x0
27.    2 0xfffffffff0dea817 Windows NT 	 PresShell::CreateRenderingContext(nsIFrame*, nsIRenderingContext**)
28.    2 0xfffffffff0dea80b Windows NT 	 nsStyleContext::GetStyleDisplay()
29.    2 0xfffffffff0dea803 Windows NT 	 nsDisplayBackground::Paint(nsDisplayListBuilder*, nsIRenderingContext*)
30.    2 0xfffffffff0dea7ff Windows NT 	 nsIFrame::GetView()
31.    2 0xfffffffff0dea7ff Windows NT 	 nsAsyncInstantiateEvent::Run()
32.    2 0xfffffffff0dea7ff Windows NT 	 do_QueryFrame::operator<nsBlockFrame> nsBlockFrame*()
33.    2 0xfffffffff0dea7ff Mac OS X 	 nsFrame::HandleDrag(nsPresContext*, nsGUIEvent*, nsEventStatus*)
34.    1 0xfffffffff0dea91b Windows NT 	 nsPlaceholderFrame::GetRealFrameFor(nsIFrame*)
35.    1 0xfffffffff0dea91b Windows NT 	 IsPercentageAware
36.    1 0xfffffffff0dea857 Windows NT 	 xul.dll@0x308d86
37.    1 0xfffffffff0dea857 Windows NT 	 nsIFrame::GetFirstChild(nsIAtom*)
38.    1 0xfffffffff0dea857 Windows NT 	 nsCSSFrameConstructor::FindFrameWithContent(nsFrameManager*, nsIFrame*, nsIContent*, nsIContent*, nsFindFrameHint*)
39.    1 0xfffffffff0dea81f Windows NT 	 nsCachedStyleData::GetStyleUIReset()
40.    1 0xfffffffff0dea81f Windows NT 	 @0x1
41.    1 0xfffffffff0dea81b Windows NT 	 nsStyleBackground::Destroy(nsPresContext*)
42.    1 0xfffffffff0dea817 Windows NT 	 xul.dll@0x37f9c0
43.    1 0xfffffffff0dea817 Windows NT 	 nsWeakFrame::Init(nsIFrame*)
44.    1 0xfffffffff0dea817 Windows NT 	 nsIFrame::GetParentViewForChildFrame(nsIFrame*)
45.    1 0xfffffffff0dea817 Mac OS X 	 nsFrame::Destroy()
46.    1 0xfffffffff0dea813 Windows NT 	 xul.dll@0x818a7a
47.    1 0xfffffffff0dea813 Windows NT 	 nsGfxScrollFrameInner::InternalScrollPositionDidChange(int, int)
48.    1 0xfffffffff0dea813 Windows NT 	 nsCSSFrameConstructor::MaybeRecreateContainerForFrameRemoval(nsIFrame*, unsigned int*)
49.    1 0xfffffffff0dea813 Windows NT 	 @0x0
50.    1 0xfffffffff0dea80f Windows NT 	 xul.dll@0x30980f
51.    1 0xfffffffff0dea80f Windows NT 	 xul.dll@0x300e46
52.    1 0xfffffffff0dea80f Mac OS X 	 nsCSSFrameConstructor::FindFrameWithContent(nsFrameManager*, nsIFrame*, nsIContent*, nsIContent*, nsFindFrameHint*)
53.    1 0xfffffffff0dea80b Windows NT 	 nsRuleNode::DestroyInternal(nsRuleNode***)
54.    1 0xfffffffff0dea807 Windows NT 	 xul.dll@0x42d9d7
55.    1 0xfffffffff0dea803 Windows NT 	 nsIFrame::BuildDisplayListForChild(nsDisplayListBuilder*, nsIFrame*, nsRect const&, nsDisplayListSet const&, unsigned int)
56.    1 0xfffffffff0dea7ff Windows NT 	 xul.dll@0x2f4054
57.    1 0xfffffffff0dea7ff Windows NT 	 PresShell::FreeMisc(unsigned int, void*)
58.    1 0xfffffffff0dea7ff Windows NT 	 HashString(nsAString_internal const&)
59.    1 0xfffffffff0dea7ff Mac OS X 	 nsFrameManager::RemoveFrame(nsIFrame*, nsIAtom*, nsIFrame*)
There was a good bump in number of beta users, number of crashes, and number of  unique signatures yesterday.  

3.6b1                         #crashes  #signatures
adus

195,912	20091105-crashdata.csv	5,962	1,364
213,475	20091106-crashdata.csv	14,917	2,537


those increases yielded 14 new low volume frame poisoned crashes.


we're now up to 73 newly detected signatures related to frame poisoning in the 3.6 beta and trunk builds

here are the new ones.

35.    1 0xfffffffff0deaa67 Mac OS X 	 nsXULPopupManager::GetSubmenuWidgetChain(nsTArray<nsIWidget*>*)

36.    1 0xfffffffff0deaa4f Windows NT 	 xul.dll@0x39b1f9

39.    1 0xfffffffff0dea91b Windows NT 	 FindNextNonWhitespaceSibling

46.    1 0xfffffffff0dea81b Windows NT 	 nsDocLoader::FireOnLocationChange(nsIWebProgress*, nsIRequest*, nsIURI*)

48.    1 0xfffffffff0dea817 Windows NT 	 xul.dll@0x188751

50.    1 0xfffffffff0dea817 Windows NT 	 nsStyleContext::Mark()

52.    1 0xfffffffff0dea817 Windows NT 	 nsContainerFrame::PositionFrameView(nsIFrame*)

55.    1 0xfffffffff0dea813 Windows NT 	 nsRegion::SetToElements(unsigned int)

62.    1 0xfffffffff0dea80b Windows NT 	 nsINode::GetCurrentDoc()

64.    1 0xfffffffff0dea807 Windows NT 	 xul.dll@0x139320

66.    1 0xfffffffff0dea7ff Windows NT 	 xul.dll@0x3061cb

71.    1 0xfffffffff0dea7fb Windows NT 	 nsCursorImage::`vector deleting destructor''(unsigned int)

72.    1 0xfffffffff0dea7f7 Windows NT 	 _PR_MD_ATOMIC_DECREMENT

73.    1 0xfffffffff0de0010 Windows NT 	 nsBlockFrame::ReflowDirtyLines(nsBlockReflowState&)
Depends on: 472774
(In reply to comment #9)

> 
> For the record, I think it *would* be possible to backport the QueryFrame stuff
> + the poisoning stuff to a 3.5.x security release without breaking anything;
> but it would be a large, tedious project that I personally do not want to do. 
> Hopefully 3.6 uptake is quick enough that we don't need it.

While you may not want to do it, personally, it's your job to follow through with this if need be.  Is everyone here comfortable relying on 3.6 uptake rate being fast enough?
Depends on: 527567
martijn,  do you have reproducible test cases for any of the signatures mentioned in the comment 15 attachment?
here are the top sites where people are hitting these frame poisoned crashes.  facebook,  youtube, and lots of google.*  so it may not be obscure content.

urls reported are not very reliable yet though so this data might be misleading.


 351 http:  www.facebook.com
  27 http:  www.youtube.com
  24 http:  www.google.com
  20 https:  mail.google.com
  20 http:  apps.facebook.com
  19 http:  www.tagged.com
  14 http:  mail.google.com
  12 http:  iwiw.hu
  11 http:  www.orkut.com.br
  11 http:  www.megaupload.com
  10 http:  get.adobe.com
  10 file:  
   9 http:  www.google.fr
   8 https:  www.google.com
   8 http:  www.mediafire.com
   8 http:  beautifulpeople.com
   6 https:  addons.mozilla.org
   6 http:  www.microsoft.com
   6 http:  www.google.com.eg
   6 http:  www.google.com.br
   6 http:  www.google.cn
   5 http:  www.shufuni.com
   5 http:  www.playnow-arena.com
   5 http:  www.google.it
   5 http:  us.mg4.mail.yahoo.com
   5 http:  adserving.cpxinteractive.com
   5 http:  127.0.0.1
   4 https:  services.mozilla.com
   4 http:  www.territorioeldorado.limao.com.br
   4 http:  www.myfreecams.com
   4 http:  www.beautifulpeople.com
   4 http:  www.aoevietnam.org
   4 http:  www.9186.net
   4 http:  quatrorodas.abril.com.br
   4 http:  mail.live.com
   4 http:  downloads.ziddu.com
   4 http:  download.cnet.com
   3 https:  www.verizonwireless.com
   3 http:  www.retailmenot.com
   3 http:  www.realraptalk.com
   3 http:  www.orkut.co.in
   3 http:  www.mozilla.org
   3 http:  www.inzercia.sk
   3 http:  www.guys4men.com
   3 http:  www.google.pl
   3 http:  www.google.de
   3 http:  www.google.com.vn
   3 http:  www.google.com.au
[long tail follows]
Depends on: 528493
tomcat has been able to reproduce the crash #67 ranked signature out of orginal list in comment 5.  crash content was found on a public site with random user hitting the content.  The url was harvesting socorro, then run back though automation.   

hurray for crash test automation! ;-)

Its the nsIFrame::GetView() crash in bug 528493

hoping for more like this one in the next round of automated testing.
Chris, how many of those crashes appeared before FP landed?
jst is going to find out which crashes existed before FP landed and he's going to get the urls for those signatures in Comment 22.
Damon, if it becomes necessary, it would be pretty easy to disable the poisoning in non-debug builds, or make it subject to a hidden pref, without backing the whole thing out.
This shows which of the crash signatures listed in chofmann's attachment that have not been seen in 3.5.5, meaning they're likely new crashes. This does not include any of the crashes for which we have no function name, as with only a module name and address there's no way to know if it's a known crash or not since the address changes with releases...
(In reply to comment #22)
> Chris, how many of those crashes appeared before FP landed?

if the crash signatures are moving around it might be hard to tell how many in my list are entirely new crashes, and how many are just new signatures.  talking with jst about this he mentioned.

JohnnyStenback:  exactly so for xul.dll@0xfoobar we really don't know

that would require looking at a bunch of crash reports by hand, or figure out some tool that could help us understand signatures that move around.
one thing we could do here to reduce risk is to get a beta deployed to a large number of users that has the fix for bug 526545

then tomcat could run another url crash testing run to find the click to crash urls that might be running in the wild.

the first run with the less effective url we are collected didn't turn up much, but with the new fix we might do a much better job of finding reproducible frame poisoning crashes.

I really thing we should get FP shipped as soon as possible, but the two things we need to do to make that happen quickly is to fix the most visible crashes, and have a strategy for back porting FP to 3.5.x quickly if someone discovers a way to turn the new 3.6 crashes into 3.5/3.0 exploits.  this comment should probably go private before this bug ever gets opened up.
Marking blocker from discussion in bug 497495 -- at least the most common of these new crashes need to be dealt with to have a stable release.
Flags: blocking1.9.2+
(In reply to comment #24)
> Damon, if it becomes necessary, it would be pretty easy to disable the
> poisoning in non-debug builds, or make it subject to a hidden pref, without
> backing the whole thing out.

A hidden pref *might* be a useful diagnostic tool in any event (apart from recognizing the poisoned frame signature). Note I'm happy we are finding these bugs.
OS: Mac OS X → All
Hardware: x86 → All
Does it make sense for zwol to own this, or chofmann to own this meta and zwol to own the various crash bugs?

Also: why are we blocking on a meta?
I think choffman should own the meta - he's much more up on our process for finding these crash reports and aggregating them into bugs, than I am.  For instance, I have no idea where the URL data he's posted in several of the bugs comes from.

I think it makes sense to block on the meta, since it's the general situation that we are not comfortable with shipping with, rather than any given one of the bugs.  Based on my lack of luck reproducing them, I suspect any given one of the bugs is going to affect a relatively small user population - but the aggregate crashiness of them all is worrisome.

(Also, I would like to state for the record that I wish I hadn't signed on to this horrible albatross of a project in the first place, and the sooner I never have to think about it again, the better.  Done complaining now.)
(In reply to comment #33)
> I think choffman should own the meta - he's much more up on our process for
> finding these crash reports and aggregating them into bugs, than I am.  For
> instance, I have no idea where the URL data he's posted in several of the bugs
> comes from.
> 
> I think it makes sense to block on the meta, since it's the general situation
> that we are not comfortable with shipping with, rather than any given one of
> the bugs.  Based on my lack of luck reproducing them, I suspect any given one
> of the bugs is going to affect a relatively small user population - but the
> aggregate crashiness of them all is worrisome.

sure,  I can own this meta bug which what I think means:

1) getting agreement an evaluation process by which to measure the impact of bug and determine what to do about them

2) make sure all, or the most important, bugs we know about all get evaluated and we have some tracking mechanism for tracking the completions of the evaluations

3) getting some agreement on a criteria by which we can arrive at some shipping decisions

I made some suggestions about item #1 in one of the other dependency bugs.  I'll restate here:

we should be looking for three kinds of bugs, and pursuing fixes for each of these as we find them.

a) click to crash or low user interaction bugs
b) high volume fp crashes
and c) real simple and low risk fixes

if we fix the "a" set of bugs we reduce the risk of 3.5.x and 3.0 zero days.

if we fix the 'b' set of bugs we won't significantly back track on the stability progress made in other areas of 3.6

and if we put together several fixes for the 'c' class of bugs that will have the benefit of fixing one of the top FP crash bugs.

if every one agrees we can move on to the next stage.  does this part make sense?
for part two in comment  34 which is

> 2) make sure all, or the most important, bugs we know about all get evaluated
> and we have some tracking mechanism for tracking the completions of the
> evaluations

I made a suggestion in comment 27.  That might play havoc with the schedule, but I think it puts us on a faster path to shipping something we understand the risks of better.  Interested in comments and feedback on that suggestion....
the other complicating part of what is going on are the changes to the skip list.

those changes are going to allow us to isolate and analyze specific problems easier and faster, but its also going to complicate things in the next few days as signatures move around.   Having a good set of crash data from another high volume beta will also allow us to sort these signature and crash volume and crash distribution changes out faster.
I too wish in my heart of hearts for another beta. The recent fix for crash reporter urls didn't make b3, so we are not getting the benefit of improved crash urls from the most recent beta.
https://bugzilla.mozilla.org/show_bug.cgi?id=526853#c8  shows and example of how we are/can use step 1 of the plan in comment 34 to evaluate and make decisions about each of the fp bugs.   we just need to keep going with that process to get to a good outcome.   I hope everyone agrees.
https://bugzilla.mozilla.org/show_bug.cgi?id=526769 might be an example of 

1. c) real simple and low risk fixes  that come from code inspection and a quick look at each bug on the fp crash list.

and get the benefit of several low volume long tail crashes that add up to the same impact of fixing a top crash.
Over to layout where this belongs IMO.
Component: Security → Layout
QA Contact: toolkit → layout
I don't think we should block on a meta-bug; if there are particular prominent crashes we can block on them.

What are the top unfixed crashes that were exposed by frame poiosoning, and how do they fit in the overall crash-stats list?
list now link-a-fied to allow faster checking against crash-stats queries,  crash reports and bugs on file.
Depends on: 530877
Depends on: 530880
Flags: wanted1.9.2? → wanted1.9.2-
Depends on: 522141
(In reply to comment #41)
> I don't think we should block on a meta-bug;

I agree in general, but not this specific case of the frame poisoning problems.
The blocking on this bug is about getting though the process of evaluting FP risks.   When we have done the the evaluation process to our satisfaction, then this bug should be non-blocking.

> if there are particular prominent crashes we can block on them.

I think this agrees with point b from comment 34

>  a) click to crash or low user interaction bugs
>  b) fix high volume fp crashes
>  and c) real simple and low risk fixes found through code inspection

I also suggested 'a' and 'c' be added to the criteria.

any comments on those?

Attachment in comment 42 has the list of signatures hit by FP changes.
No longer depends on: 522141
Depends on: 530899
(In reply to comment #43)
> (In reply to comment #41)
> > I don't think we should block on a meta-bug;
> 
> I agree in general, but not this specific case of the frame poisoning problems.
> The blocking on this bug is about getting though the process of evaluting FP
> risks.   When we have done the the evaluation process to our satisfaction, then
> this bug should be non-blocking.

What do you mean by "evaluating FP risks"?

I don't see anything in this bug that comes within an order of magnitude of suggesting that we should even consider disabling frame poisoning.
If there are particular evaluation tasks, particular changes you think we should make, or particular crashes that you think should block the release, please make sure there's a bug filed on them and nominate that bug to block the release.

This bug is a metabug and there's no clear way to tell if it's fixed.  Therefore it can't be a blocker for a release.  Minusing.
Flags: blocking1.9.2+ → blocking1.9.2-
(In reply to comment #44)
> I don't see anything in this bug that comes within an order of magnitude of
> suggesting that we should even consider disabling frame poisoning.

And to expand on why I think this:  this is a drop in the bucket compared to our total crashiness.  Of the 14615 crashes yesterday on 3.6b3 (per 20091123-crashdata.csv.gz), 413 of them had "f0de" in the crash address, and of those, most were fixed, since the top 10 such signatures were:

    289 nsAccessibilityService::GetAccessible(nsIDOMNode*, nsIPresShell*, nsIWeakReference*, nsIFrame**, int*, nsIAccessible**)
     51 nsAccessibleTreeWalker::PopState()
     13 nsCOMPtr_base::assign_from_qi(nsQueryInterface, nsID const&)
      7 nsIFrame::GetStyleDisplay()
      7 nsCOMPtr_base::assign_with_AddRef(nsISupports*)
      7 nsAsyncInstantiateEvent::Run()
      6 nsXULPopupManager::HidePopupsInDocShell(nsIDocShellTreeItem*)
      5 nsINode::GetOwnerDoc()
      5 nsAccessibleTreeWalker::UpdateFrame(int)
      4 nsStyleContext::GetStyleDisplay()

The top two of these (bug 525579 and bug 527287) are fixed (and perhaps others, but I'm not counting those).  That leaves 73 crashes (413 - (289 + 51)) out of, let's (optimistically) say 10000 unfixed crashes.


I have no problem whatsoever accepting a <1% increase in total crashiness (and likely to improve over time) in exchange for a substantial improvement in security through the mitigation of most of the potentially exploitable security bugs in layout code.
(In reply to comment #44)
> 
> What do you mean by "evaluating FP risks"?
>

making sure that at least some pct. of the 241 have had some eyes on it to see if there is a quick fix, or some user comments, or tomcat url testing that would lead to the discovery of easy steps to reproduce.

I'm not sure what that pct. is, but I'll make a proposal that it be 10%, or looking at the top 24 bugs.  Maybe the number should be higher.  Right now we are half way to that 10% goal.

Now that things are in motion to do beta4 with the improved URL reporting we should have better results from a tomcat automation run testing all the crash urls from frame poisoned crashes.  I think that should be a blocking task tracked in this bug.
 
> I don't see anything in this bug that comes within an order of magnitude of
> suggesting that we should even consider disabling frame poisoning.

That's good.  I don't think disabling frame poisoning in 3.6 is an option we should be considering either.
(In reply to comment #46)
>      13 nsCOMPtr_base::assign_from_qi(nsQueryInterface, nsID const&)

This is bug 528134.  (That bug is not all of the crashes with that signature, but I think it is all or at least almost all of the crashes with that signature and a frame poisoning crash address.)
Depends on: 530965
> 187. 1 0xfffffffff0dea800 Windows NT 

seems to be the same.  see bug 530965

It appears to be a set of crashes with signature [@ nsFrameManager::ReResolveStyleContext(nsPresContext*, nsIFrame*, nsIContent*, nsStyleChangeList*, nsChangeHint, int)]

that are both FP and non-FP crashes
(In reply to comment #47)
> making sure that at least some pct. of the 241 have had some eyes on it to see
> if there is a quick fix, or some user comments, or tomcat url testing that
> would lead to the discovery of easy steps to reproduce.

Why is evaluating these crashes more important than evaluating other, more common, crashes?
Perhaps because the crashes would represent regressions, from a user's point of view?  Presumably, many of these crashes are cases where web pages now crash (safely), when they didn't used to crash at all (even though they exercised some dangerous code paths).
(In reply to comment #46)
>     289 nsAccessibilityService::GetAccessible(nsIDOMNode*, nsIPresShell*,
> nsIWeakReference*, nsIFrame**, int*, nsIAccessible**)

bug 525579

>      51 nsAccessibleTreeWalker::PopState()

bug 527287

>      13 nsCOMPtr_base::assign_from_qi(nsQueryInterface, nsID const&)

bug 528134

>       7 nsIFrame::GetStyleDisplay()

bug 530880

>       7 nsCOMPtr_base::assign_with_AddRef(nsISupports*)

bug 530899 (maybe duplicate of bug 526391?)

>       7 nsAsyncInstantiateEvent::Run()

bug 530877

>       6 nsXULPopupManager::HidePopupsInDocShell(nsIDocShellTreeItem*)

bug 526391

>       5 nsINode::GetOwnerDoc()

bug 527448, currently marked duplicate of bug 526391

>       5 nsAccessibleTreeWalker::UpdateFrame(int)

Likely fixed by the fixes to the first two?

>       4 nsStyleContext::GetStyleDisplay()

bug 526592
Looking at a week's worth of 3.6b3 poison-address crashes basically produced a very similar list of top crashes, with the addition of nsIContent::SetText(nsAString_internal const&, int), which is bug 515096.
> view?  Presumably, many of these crashes are cases where web pages now crash
> (safely), when they didn't used to crash at all (even though they exercised
> some dangerous code paths).

I don't think that's a safe presumption. It's likely that many of these issues were already casuing crashes, just with much less predictable stacks/signatures: frame poisoning makes the crash locations much more predictable and easier to track and fix!
Depends on: 531069
Depends on: 531075
from comment 52

bug 530877  (is not security closed)

>       6 nsXULPopupManager::HidePopupsInDocShell(nsIDocShellTreeItem*)

I filed a parallel bug  531075 to track the frame poisoning crashes that it looks like weave is tickling in at least some of those crash cases.  

there are plans to promote we more real soon, so that could expose more people to these crashes, and discovery of the fp address crashes.  any thoughts on if  that one be also turned into an exploit against 3.5.x?
Summary: new crashes as fall out of frame poisoning → investigate new crashes as fall out of frame poisoning before shipping 3.6
No longer depends on: 531069
(In reply to comment #54)
> > view?  Presumably, many of these crashes are cases where web pages now crash
> > (safely), when they didn't used to crash at all (even though they exercised
> > some dangerous code paths).
> 
> I don't think that's a safe presumption. It's likely that many of these issues
> were already casuing crashes, just with much less predictable
> stacks/signatures: frame poisoning makes the crash locations much more
> predictable and easier to track and fix!


What I'm hoping for in this bug is to replace words like  "presumably many" with some harder data.  

If only 20 of the 200 fp address crashes can be recognized as a problem in 3.6, figured out as pattern by security researchers, and then turned into reliable or simi-reliable crashes on 3.5.x  that could turn into "a month of frame poisoning crash bugs" zero-day fest on 3.5.

The race is on to test this theory.  I'm hoping we stay ahead of the security researchers in understanding what is, and is not, possible with the 200 fp bugs.
Depends on: 531125
(In reply to comment #56)
> If only 20 of the 200 fp address crashes can be recognized as a problem in 3.6,
> figured out as pattern by security researchers, and then turned into reliable
> or simi-reliable crashes on 3.5.x  that could turn into "a month of frame
> poisoning crash bugs" zero-day fest on 3.5.
> 
> The race is on to test this theory.  I'm hoping we stay ahead of the security
> researchers in understanding what is, and is not, possible with the 200 fp
> bugs.

Why are the crashes that show up with frame-poisoning addresses more interesting for security researchers than other crashes with interesting-looking addresses, which we've had many of publicly in our crash database for years?  (In many ways, I think they're less interesting, since the users with browsers exploitable by such bugs will start decreasing rapidly once we ship 3.6.)
They're more interesting in part because they're easier to find: they become deterministic crashes in 3.6, which are easier to find and analyze than the current non-deterministic ones.  At that point, the task of exploit development against 3.5, the hypothesis says, is not much of a burden.
Depends on: 531175
Depends on: 531284
tomcat also has a test run going with the url's listed in https://bug531440.bugzilla.mozilla.org/attachment.cgi?id=414942&t=PEwnqPLN3g

that should cover on all the 3.6b4 urls collected for the first few days of that release and tell us if we have any clicke to load and frame poisoned crash pages.
Depends on: 531952
Depends on: 532335
Depends on: 533016
Depends on: 533796
new list with only fp crashes seen by the 400k users of 3.6b4  this will weed out signatures for bugs fixed in the previous betas.   431 signatures seen over all in 3.6*, but only 146 seen in b4, so progress is being made.
(In reply to comment #61)
> Created an attachment (id=416784) [details]
> 3.6b4 fp crashes as of 2009 12 08 - 146 signatures
> 
> new list with only fp crashes seen by the 400k users of 3.6b4  this will weed
> out signatures for bugs fixed in the previous betas.   431 signatures seen over
> all in 3.6*, but only 146 seen in b4, so progress is being made.

chofmann, do you want a new url run for this ?
(In reply to comment #63)
> Created an attachment (id=416908) [details]
> 1245 urls out of 3.6b4 testing
> 

thanks much sir ! Crashcats are go :)
Depends on: 532539
The #1 crash, inDOMUtils::GetCSSStyleRules(nsIDOMElement*, nsISupportsArray**), looks like it should be fixed by bug 528134.

The #2 crash is bug 530877.

The #3 crash is bug 533016. Need more traction there.

A lot of the other stack signatures look like crashes during painting due to bad frame pointers, as if we were painting a corrupt frame tree. Bug 526216 is a randomly-reproducible crash of this type, so I want to look into that and maybe it will give us a clue about these others.
fix for bug 526216 hasn't landed on 1.9.2, but the #1 signature inDOMUtils::GetCSSStyleRules in beta 4 can be confirmed as fixed, and the numbers are generally lower in beta 5.  the ranking has shifted around a bit too.  here are the top 5 with more than 50 crashes.

1. 167 0xfffffffff0de8003  nsIFrame::GetOffsetTo(nsIFrame const*)  bug 531125
2. 136 0xfffffffff0de801f nsCachedStyleData::GetStyleDisplay() bug 532335

3. 120 0xfffffffff0de7fff nsAsyncInstantiateEvent::Run()   -- bug 530877
4. 85 0xfffffffff0de7fff nsCOMPtr_base::assign_with_AddRef(nsISupports*) | nsCOMPtr::operator=(nsIDOMRange*) | xul.dll@0x40548a  --  bug 533016

5. 79 0xfffffffff0de8017 nsIFrame::GetStyleDisplay()  -- bug 532335
Alias: PoisonFrameCrash
Depends on: 526866
(In reply to comment #68)
> Created an attachment (id=437487) [details]
> fp crash urls since the first of the year for automated and by hand testing

starting my test (and thanks Chris !)
Depends on: 559491
Depends on: 572582
Depends on: 580504
Depends on: ZDI-CAN-852
Depends on: 588158
Depends on: ZDI-CAN-883
Depends on: 604900
Depends on: 630107
Depends on: 633476
Depends on: 632828
Depends on: 635329
Depends on: 655451
Group: core-security
Summary: investigate new crashes as fall out of frame poisoning before shipping 3.6 → [meta] Crashes at the Poison Frame address
Depends on: 682649
Depends on: 679933
Depends on: 683702
Depends on: 700112
Depends on: 802985
Depends on: 856262
Depends on: 862185
Severity: normal → S3
You need to log in before you can comment on or make changes to this bug.

Attachment

General

Creator:
Created:
Updated:
Size: