Tacon, R. (2005) 'Football and Social Inclusion: Evaluating Social Policy', a research paper from the Football Governance Research Centre, Birkbeck, University of London [online] http://www.football-research.org/docs/socialinclusion.pdf

[A lengthy review of previous attempts to measure the effects of football on social inclusion. A new approach is developed based on the notion of realist evaluation, leading to a very useful checklist at the back of the paper (pages 26-28). Very useful and extensive references as well]

Government often asserts that football can promote social inclusion, or at least combat social exclusion, although there is less hard evidence to test this assertion. What is needed is some practical evaluation so that governments and local clubs can test their policies.

Social exclusion means different things. It tended to replace the notion of poverty in European discussions and legislation, and has become prominent in New Labour in the UK. Various definitions are possible -- multiple deprivation for the Social Exclusion Unit, social detachment for Giddens, best understood as a process for Castells, depending on a number of factors including social prejudices and public policies. Promoting social inclusion tends to be seen as a simple reversal of social exclusion.

Sport is similarly difficult to define and has different meanings attached to it at different times. It has been associated with the growth of civic and local identity, but also with individualisation and detraditionalisation, both of which promote exclusion. Work on the fan indicates the possibilities [and see Giulianotti on the effects of television and consumerism]. Some policies promote active participation, while others use popular football clubs to achieve other objectives [my example would be anti-racist objectives].

A number of attempts to pin down the benefits of sport have been undertaken, including links between sports participation and 'increased identity with local communities, increased work productivity, and reductions in anti-social behaviour' (3). Sport used to be seen primarily as economically regenerative, but the emphasis shifted to social goals such as increased tolerance and health. However, there is a need for much more hard evidence, especially after the UK government insisted on evidence-based policy making. What sort of evidence is another matter -- long-term? Evidence to justify existing policies? Evidence to examine all the objectives or just the key ones? Theoretical incoherence may make the problem more acute.

New Labour has become worried about commercialism as well, and about the tension between the sporting and economic goals of the clubs, which has been prominent in some legal disputes about promoting fair competition. Football clubs have shifted towards profit-making activities, and there may now be a need to demonstrate some cultural justification to avoid further legislation. At the same time, business failures have increased among football clubs, which has led some critics to propose new forms of ownership, such as community ownership. There is also the crisis induced by dependence on television revenue, and the divisions this has produced between Premier League and other clubs. This means some clubs need to focus even more on match-day revenue, and on their relationships with supporters and local communities. This in turn has led to an interest in supporter involvement and even ownership: 'Supporters' Trusts have been established at more than 120 clubs since September 2000' (8), and they typically support social objectives such as increasing participation and antiracism.

These policies indicate the growing importance of 'community', although there is little attempt at a definition or at gathering evidence. There have been some Football in the Community schemes, originally designed to combat hooliganism, although again evaluation has been overlooked. Now, however, there seems to be an interest in formal evaluation.

There have been different forms of evaluation, but also problems with them. Classically, evaluation has been undertaken with 'milestones, outputs and outcomes' (10) (milestones are stages provided by funding agencies). Outcomes, the longer term benefits, are the most difficult to evaluate. Thus Collins et al (1999), doing a review for PAT10, found only 11 studies that looked at outcomes rigorously, many that did no evaluation at all, and even the 11 used different measures and methods [details on page 10]. Another survey of directors of leisure and sport providers found reluctance to suggest a clear link between sport and community development. There is much anecdotal evidence, however.

Evaluation is low profile, with insufficient resources and thought. Often, it is not funded adequately by sponsors, who typically want short-term results and positive findings. Many of the actual projects last only for three years, making any long-term benefit difficult to detect. The same goes for the short duration of government policy and ministerial enthusiasm. Staff are also employed on the projects short-term, and tend not to have particular expertise in evaluation: there is often no 'culture of evaluation' (12). Generally, it is important to make sure that people on the team have evaluation skills.

Measures are often inadequate, and usually unrelated to any other data as reference. Long-term effects are hard to study with a mobile population. Long-term effects may be observable somewhere else, with 'examples in schools, or work places' (12). There is a problem in choosing definitions of terms like self-esteem, and in controlling other variables.

There have been several recent attempts to produce better evaluation, including a DCMS study on long-term impact (DCMS 2001) [further examples pages 13-15]. There have been changes in the mode of evaluation: collecting baseline data and using national distributions, tests and self-report checklists, control groups, questionnaires and observation (Playing for Success); and surveys of agencies, case-study research, project workshops, literature searches, and quantitative and qualitative outcome data (Positive Futures).

There are still some debates about the models of evaluation being used, however, and 'pseudo-scientific models of evaluation may be inappropriate... given the complexity of the processes involved' (14): feelings in particular may not be easy to measure in quantitative terms. The same point is made about randomised controlled trials, borrowed from biomedical approaches.

One approach is 'realist evaluation, which has its ontological foundation in critical realism' (15). Critical realism makes a distinction between 'real, actual and empirical domains' (15) [citing Bhaskar]. Reality is at a level which underlies empirically observable events. Further, 'mental processes are taken to have causal power... and... human agency interacts with social structures in a continuous process of structuration' (15-16). As a result, explanation involves '"retroduction... [moving from]... a conception of some phenomenon of interest to a conception of some totally different type of thing, mechanism, structure or condition, that... is responsible for the given phenomenon" (Lawson 1997)' [what others have called transcendental deduction]. In this way, social exclusion and inclusion might be seen as underlying the symptoms that are being described in conventional evaluation: there may be further mechanisms and structures underlying 'the observable indicators of exclusion that are the focus of many definitions' (16). Pawson and Tilley (1997) seem to be key figures.

Realist evaluation works by seeing how the human subjects involved actually structurate 'the opportunities and suggestions provided' (16) [very like Giddens with rules and resources?]. These mechanisms need to be studied in their own right and not just as variables or inputs. The technique seems quite good at criticising naive positivist approaches [and there are even hints of ANT and its vocabulary of black boxes -- 17]. Pawson and Tilley offer a critique of the usual scientific model by looking for 'mechanisms, contexts and outcomes' (later called CMO) [and what looks like interests -- 'what might work for whom and in what circumstances' -- see the diagram on page 18]. However, there are still levels of analysis to consider, especially more macro forces that might play a part. Nevertheless, there is much support for the model, which includes seeing statistical analysis not as self-sufficient, but as a good description of the mechanism at the empirical level.

Applying this to the traditional approach, the vocabulary of milestones, outputs and outcomes implies some inexorable progress towards social betterment. However, critical realists would suggest that 'participation alone cannot provide evidence for the benefits' (19). Instead, 'programmes will only work for certain individuals via particular mechanisms in specific contexts' (20). This variability has been recognized in some recent statements, including ones which focus on the conditions required for sport to have desirable outcomes, or the  '"associated processes and experiences which underpin successful initiatives"' (20, quoting Coalter 2001). The authors claim that realist evaluation can do better  [presumably by demonstrating the importance of processes and contexts?].

Some examples employ realist evaluation in social work practice, evaluating the extent to which programmes achieve self-fulfilment and other familiar outcomes such as tolerance, a reduction of offending and so on. The emphasis has been on the importance of context, and on 'knowing what works for whom in what circumstances' (21). Such evaluations in social work are much better funded and grounded, however.

There are some more specific studies in which realist evaluation has been used 'implicitly or explicitly' (22):

Pawson and Tilley (1997) evaluated the effectiveness of a housing management programme in reducing crime. They outlined possible contexts, mechanisms and outcomes, and used a variety of methods to observe and evaluate the programme. They then used their observations to refine the general theory.

[Other models are discussed pages 23 to 25. Some of them have some interesting problems, for example in pursuing evaluation of this kind in projects designed to promote particular outcomes. Some pointed to a number of mechanisms producing the outcome, not just the chosen one. One turns on a study of the effectiveness of health action in Plymouth: here, project workers were asked to identify causal mechanisms and to contribute to theories of change, and some problems emerged including different agencies at work, and variable contact between projects and clients. Positive Futures is the best example, although it is not explicit in its use of realist evaluation. Nevertheless, there is a recognition of relations between structure and agency, and the use of quantitative and qualitative measures, the identification of mechanisms such as the need to build trust, and a way of using results of evaluation to improve design -- 25].

[The analysis results in the paper's own particular template, based on stages in realist evaluation and describing techniques that might be used. The full template spreads over three pages -- 26-28 -- and seems very sensible. For example, it invites people to construct theories of how the project works using previous research and evaluations, focusing especially on mechanisms, contexts and outcomes, then to develop specific hypotheses, and then to gather data using a variety of methods. Then the evaluation is fed back into the development of the project -- successful mechanisms are identified and compared with theory, new theories and hypotheses are developed, and knowledge of mechanisms is deepened and used to train staff].

Overall, there is a need for rigorous evaluation of football-based projects, to provide evidence to test the frequent claims of their effectiveness. Realist evaluation would be a good way forward. Results of specific projects might be used to design better projects, and to offer guidance to policy makers. Methods will have to be refined, and should be guided by some underlying theories such as realist evaluation.

